Unit 7


Functions of several variables 


Introduction 


You are familiar with functions such as $f(x) = x^2$ and $g(t) = \sin^2 t$. You
know how to differentiate and integrate such functions, and how to find 
their stationary points (including local maxima and minima). This is the 
usual stuff of calculus. 


In describing the real world, however, we often meet functions of more 
than one variable. The volume of a brick depends on the lengths of its 
three sides. The rate of a chemical reaction depends on the concentrations 
of each of the reacting chemicals, and also on temperature. We may 
occasionally treat some of the variables as fixed parameters, but in general 
more than one variable is of interest, and we need to discuss functions of 
several variables. In this book you will see how we can extend the methods 
of calculus to functions of two and more variables. 


The world that we inhabit has three spatial dimensions, and there are 
many physical quantities that vary throughout space. For example, the 
temperature may vary throughout a room. Since each point in the room 
can be represented by three coordinates (x, y, z), temperature is a function 
of the three variables x, y and z. We indicate this by writing the 
temperature function as T(x, y, z).


In physics, quantities that depend on position throughout a whole region 
of space are called fields. There are many examples. We have just 
mentioned the temperature field; there are also electric fields, magnetic 
fields, gravitational fields, density fields, wind-velocity fields, and so on. 
The precise definitions of these fields need not concern us, but they all 
contain the essential idea of a physical quantity that varies with position. 


Some physical quantities are vectors, having both magnitude and 
direction. For example, the velocity of a particle is a vector describing how 
fast the particle is moving and in what direction. In describing the 
wind-velocity field, we must specify a vector (the velocity of the wind) at 
each point in space. We therefore say that the wind-velocity field is a 
vector field. By contrast, the temperature field is a scalar field because 
temperature is a scalar quantity and has no direction associated with it. 


This brief introduction has exposed two related ideas: 


• We need to extend the methods of calculus to deal with functions of
more than one variable. 


• Nature provides many examples of functions of more than one variable
in the form of fields that vary throughout regions of space. The fields 
that we will discuss are of two types: scalar fields and vector fields. 


With this background, we can now outline the structure of this book. 


Unit 7 introduces functions of more than one variable and shows how to 
differentiate them. This allows us to explore how sensitive these functions 
are to small changes in their variables, and to locate points at which the 
functions have maxima and minima. 


Unit 8 describes how to integrate functions of more than one variable. For 
example, given an object of variable density, we will explain how its total 
mass can be found by integrating over the volume of the object. This is a 
so-called volume integral. We will also discuss integrals over surfaces, 
allowing us to find the surface areas of shapes such as spheres or cones. 


Unit 9 looks at scalar and vector fields, and considers how to differentiate 
them. Such fields have special properties, and this leads to results beyond 
those introduced in Unit 7. 


Unit 10 completes the book by discussing the integration of fields. Again, 
there are special results that apply to fields and take us beyond the 
integrals of Unit 8; these results turn out to be extremely powerful in 
physics and applied mathematics. 


Because this book is largely concerned with fields, we will
inevitably use physical examples, more so than elsewhere in this module.
If you are unfamiliar with physics, be reassured. You need no prior 
understanding of physics, just a willingness to accept concepts such as 
temperature or velocity when they are used to illustrate mathematical 
ideas. 


Study guide 


This unit introduces functions of two or more variables. It explains how to 
differentiate these functions, and how to put the derivatives to use. 


Section 1 introduces functions of more than one variable, and shows how 
they can be represented graphically in simple cases. It goes on to describe 
how these functions are differentiated with respect to individual variables. 


Section 2 discusses the chain rule for functions of more than one variable. 
In fact, a number of different results go under this heading. These rules 
tell us how sensitive a function is to small changes in its variables, and how 
to find the derivative of a composite function — a function whose variables 
depend on other variables. 


Section 3 briefly introduces Taylor polynomials for functions of more than 
one variable. 


Finally, Section 4 investigates maxima, minima and other stationary points 
of functions of several variables. This is an important topic because many 
of the problems met in science, applied mathematics and economics reduce 
to finding the conditions under which functions have maximum or 
minimum values. 


1 Partial differentiation 


1.1 Notation for functions of one variable 


Before describing functions of two and more variables, it is worth reviewing 
the notation used for functions of one variable. As an example, suppose 
that 


$$z = f(x) = x^2 + 4. \tag{1}$$


Here x is called the argument of the function f(x) or the independent
variable, and z is called the dependent variable. If we insert a value for x in
the expression $x^2 + 4$ on the right-hand side of equation (1), we get the
corresponding function value. For x = 3, the function value is
$f(3) = 3^2 + 4 = 13$. For x = a, the function value is $f(a) = a^2 + 4$.


As noted in Unit 1, similar notation is being used for different things.
Equation (1) exhibits the rule that defines the function (in this case, ‘take
the square and add four’), so we may talk about the function f(x). On the
other hand, f(3) and f(a) represent particular values of the function. In
spite of this clash, we will continue to use the f(x) notation (and extend it
to functions of more than one variable); it is simply too useful to avoid.

Such ‘abuse of notation’ is practically unavoidable in subjects like physics, where
mathematical symbols need to be related to physical concepts.

Moreover, in science, the symbols used to represent functions are often
chosen in a special way. If we are interested in how a quantity M depends
on position x, we may denote the function that describes this dependence
by M(x). Note that the symbol M is used for both the quantity and the
function. We might also be interested in how M depends on time t, and
denote the corresponding function by M(t). This notation keeps us alert to
the fact that both functions describe how the quantity M varies, but with
respect to different variables. It avoids cluttering our descriptions with
arbitrary new symbols whose physical meaning may be rapidly forgotten.


Suppose that the temperature T varies with position x along a rod. With 
T measured in degrees Celsius and x in metres, we might have 


$$T = T(x) = 100\,e^{-x} \quad \text{for } 0 \le x \le 1.$$


The symbol T on the extreme left is the dependent variable, and x is the
independent variable. The way in which T depends on x is described by
the function $T(x) = 100\,e^{-x}$, which has x as its argument. The domain of
this function is the region $0 \le x \le 1$.


1.2 Functions of more than one variable 


The notation used for functions of a single variable is readily extended to 
functions of two or more variables. For example, the volume of the cone 
shown in Figure 1 is given by the formula $V = \frac{1}{3}\pi r^2 h$, where h is the
height of the cone and r is the radius of its base. This can be described by 
a function of two variables: 


$$V(h, r) = \tfrac{1}{3}\pi r^2 h.$$

Figure 1 A cone of height h and base radius r

Figure 2 The Cartesian coordinate system used to specify points on a disc


The arguments of this function are h and r. We can also say that h and r 
are the independent variables, and that the volume of the cone, V, is the 
dependent variable. For physical reasons, h and r are both positive 
numbers (measured in metres, say) so the domain of the function V(h,r) is 
$0 < h < \infty$, $0 < r < \infty$.


The volume of the cone can be expressed in terms of other independent 
variables. The area of the base of the cone is $A = \pi r^2$, so the volume of the
cone is $V = \frac{1}{3}Ah$, and this can be described by the function


$$V(h, A) = \tfrac{1}{3}Ah.$$


Note that the symbol V has been used for the physical quantity ‘volume’ 
and for two different functions, V(h,r) and V(h, A). Fortunately, the 
distinction between quantities and functions is generally clear from 
context, and the functions V(h, r) and V(h, A) are distinguished by the
contents of their round brackets. Where there is a risk of ambiguity, 
guiding words will be supplied. 


A second example is provided by temperature measured over a region. 
Suppose that we measure temperature on the surface of a horizontal disc 
of radius R = 3 (in metres). We describe points on the surface of the disc 
by Cartesian coordinates x and y (in metres), with the origin taken to be 
at the centre of the disc (Figure 2). 


The temperature may vary over the disc, but each point on the disc will 
have a well-defined temperature T, measured in degrees Celsius. We can
represent the temperature variation over the entire surface of the disc by a
function T(x, y). For example, we might have


$$T(x, y) = 100\,e^{-(x^2 + y^2)/10} \quad \text{on the disc.} \tag{2}$$


Here we use the symbol T for the temperature function on the disc, and it
has arguments x and y. We can also say that x and y are the independent 
variables, and T is the dependent variable. The values of x and y are 
restricted because the formula in equation (2) applies only on the disc, and 
not elsewhere. The domain of the function is therefore the surface of the 
disc, i.e. the collection of points (x, y) with $x^2 + y^2 \le 9$.


Given the temperature function, it is a simple matter to calculate the 
temperature at any point on the disc. For example, at the point 
(x,y) = (1,2), the temperature (in degrees Celsius) is 


$$T(1, 2) = 100\,e^{-(1 + 2^2)/10} = 100\,e^{-0.5} \approx 60.7.$$


In this case, the point (2,1) has the same temperature as the point (1, 2), 
but this is an accidental feature of the function that we have chosen. In 
general, the function value f(a, b) is not the same as the function value 
f(b,a), as you will see in the following exercise. 
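
The evaluation above is easy to reproduce numerically. The following Python sketch evaluates the temperature function of equation (2) at the points (1, 2) and (2, 1), and then uses the cone volume V(h, r) to show that swapping the arguments generally changes the function value. The names `temperature` and `cone_volume` are illustrative choices, not notation used elsewhere in this unit.

```python
import math

def temperature(x, y):
    """Temperature on the disc, equation (2): T(x, y) = 100 exp(-(x^2 + y^2)/10)."""
    return 100.0 * math.exp(-(x**2 + y**2) / 10.0)

def cone_volume(h, r):
    """Volume of a cone of height h and base radius r: V(h, r) = pi r^2 h / 3."""
    return math.pi * r**2 * h / 3.0

t_12 = temperature(1, 2)
t_21 = temperature(2, 1)
print(round(t_12, 1), round(t_21, 1))   # both round to 60.7

v_12 = cone_volume(1, 2)
v_21 = cone_volume(2, 1)
print(v_12 == v_21)                     # False
```

The temperature function depends on x and y only through the combination x² + y², which is why the two temperatures coincide; the cone volume has no such symmetry.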


Exercise 1 

Given f(x,y) = 3x? — 2y?, evaluate the following. 

(a) f(2, 3)   (b) f(3, −2)   (c) f(a, b)   (d) f(b, a)
(e) f(2a, b)   (f) f(a − b, 0)   (g) f(x, 2)


1.3. Graphical representations 


There are three main ways of visualising functions of two variables: 

• Give a perspective view of a surface that represents the function.
• Show one or more slices through the surface.
• Draw a contour map.


All of these methods will be used in this unit, and we briefly introduce 
them now. 


Perspective view of a surface 


We can get a good overall understanding of a function f(x,y) by plotting a 
‘three-dimensional graph’. The two independent variables x and y are 
plotted in the horizontal xy-plane, and the corresponding function values 
z = f(x,y) are plotted along the vertical z-axis. Normally, the independent 
variables cover a continuous range and the function values vary smoothly, 
so we get a continuous surface, which can be viewed in perspective. 


For example, Figure 3(a) is a three-dimensional graph of the function
$f(x, y) = x^2 + y^2$. This shape is called a circular paraboloid, and is used in
radio telescopes to focus a parallel beam to a single point (Figure 4(a)).
Figure 3(b) is the graph of $f(x, y) = x^2 - y^2$. This shape is called a
hyperbolic paraboloid, and has gained popularity with architects because it
is eye-catching and relatively easy to construct (Figure 4(b)).


Figure 3 Three-dimensional graphs of (a) $f(x, y) = x^2 + y^2$ and
(b) $f(x, y) = x^2 - y^2$

Figure 4 (a) Radio telescopes have dishes that are circular paraboloids.
(b) The roof of a station in Warsaw is shaped as a hyperbolic paraboloid.

Figure 5 Part of the plane representing the function $f(x, y) = x + 2y + 3$

An important function of two variables is 
$$f(x, y) = Ax + By + C, \tag{3}$$


where A, B and C are constants. This is a linear function of x and y, and 
when we plot it we get a plane. Conversely, any plane in three-dimensional
space that is not vertical can be written as z = f(x, y), with f(x, y) given by
equation (3). For example, the plane
passing through the points (−2, −2, −3), (0, 0, 3) and (2, 2, 9), and
extending indefinitely, is given by the surface


$$z = f(x, y) = x + 2y + 3.$$


A portion of this surface is shown in Figure 5. 
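
As a quick check, the short Python sketch below confirms that each of the three points quoted above satisfies z = x + 2y + 3. The list name `points` and the function name `f` are chosen here purely for illustration.

```python
def f(x, y):
    """Linear function of equation (3) with A = 1, B = 2, C = 3."""
    return x + 2 * y + 3

# The three points (x, y, z) quoted in the text.
points = [(-2, -2, -3), (0, 0, 3), (2, 2, 9)]

# A point lies on the surface z = f(x, y) when its z-coordinate equals f(x, y).
on_plane = [f(x, y) == z for (x, y, z) in points]
print(on_plane)  # [True, True, True]
```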


Graphs of section functions 


A second way of visualising a function of more than one variable is to fix 
the values of all but one of its variables, and then plot a graph showing its 
dependence on the remaining variable. 


For example, given $f(x, y) = x^2 - y^2$, we can set y = 3 and then plot a
graph of $f(x, 3) = x^2 - 9$ against x, as in Figure 6(a). Or we can set x = 2
and plot a graph of $f(2, y) = 4 - y^2$ against y, as in Figure 6(b).


Figure 6 Graphs of $f(x, y) = x^2 - y^2$ with one variable held constant:
(a) graph of f(x, 3) against x; (b) graph of f(2, y) against y


Figure 6(a) is obtained by taking a vertical section (or slice) through the
three-dimensional graph in Figure 3(b). The slice is parallel to the x-axis
at the fixed value y = 3. Similarly, Figure 6(b) is obtained by taking a
vertical slice parallel to the y-axis at the fixed value x = 2. Any function of
the form f(x, a) or f(b, y), where a and b are constants, is called a section
function. By itself, a single section function provides limited information:
if y is held constant, the section function tells us only about the
dependence on x at one fixed value of y, and it tells us nothing about the
dependence on y. Nevertheless, a series of section functions taken at
various fixed values of y, and various fixed values of x, can provide a great
deal of useful information.
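
The idea of fixing all but one variable translates directly into code. The sketch below builds the two section functions f(x, 3) and f(2, y) for f(x, y) = x² − y² and tabulates a few values of each. The helper names `section_at_fixed_y` and `section_at_fixed_x` are hypothetical, introduced only for this illustration.

```python
def f(x, y):
    """The function f(x, y) = x^2 - y^2 (a hyperbolic paraboloid)."""
    return x**2 - y**2

def section_at_fixed_y(func, y0):
    """Return the section function x -> func(x, y0)."""
    return lambda x: func(x, y0)

def section_at_fixed_x(func, x0):
    """Return the section function y -> func(x0, y)."""
    return lambda y: func(x0, y)

f_y3 = section_at_fixed_y(f, 3)   # f(x, 3) = x^2 - 9
f_x2 = section_at_fixed_x(f, 2)   # f(2, y) = 4 - y^2

xs = [-2, -1, 0, 1, 2]
print([f_y3(x) for x in xs])      # [-5, -8, -9, -8, -5]
print([f_x2(y) for y in xs])      # [0, 3, 4, 3, 0]
```

Each section function is an ordinary function of one variable, so it can be plotted or differentiated using the familiar single-variable methods.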


Contour maps 


Finally, a function of two variables can be visualised using a contour map. 
You may be familiar with this idea from topographical maps (such as those 
produced by Ordnance Survey in the UK). Here the height of the land is 
described by a function h(x,y), where x and y are position coordinates in 
a horizontal plane. The map shows a series of lines joining points of the 
same height; the example in Figure 7 shows heights between 5000 m below 
sea level and 3000 m above sea level. 


For any smooth function f(x,y), we can plot a similar contour map. The x 
and y variables (whether they are position coordinates or not) are plotted 
in the plane of the page, and a curve is drawn in the xy-plane connecting
neighbouring points where the function has a fixed value, say $f = c_1$. This
is repeated for a series of values $c_1, c_2, c_3, \ldots$, giving a family of curves.
A number is written beside each curve to indicate the value to which it
refers. The curves are called contour lines, and the entire diagram is
called a contour map.


Contour maps for the functions $f(x, y) = x^2 + y^2$ and $f(x, y) = x^2 - y^2$ are
shown in Figures 8(a) and 8(b). To make sure that you are reading these
contour maps correctly, you should compare them with Figures 3(a)
and 3(b).


Figure 7 A contour map of Hawaii and the surrounding ocean floor

Figure 8 Contour maps for: (a) the function $f(x, y) = x^2 + y^2$; (b) the
function $f(x, y) = x^2 - y^2$

Figure 9 A cross-section in the z = 0 plane through the spherical contour surfaces of
$f(x, y, z) = 1/\sqrt{x^2 + y^2 + z^2}$

Figure 10 At any given point, the slope of a road depends on the direction that
it takes over the terrain


Beyond two variables


Visualisation of functions of three or more variables is harder. We can 
always plot section functions, and for a function f(x, y,z) we can plot the 
equivalent of a contour map, but in three dimensions. Here, neighbouring 
points with the same function value form a surface, known as a contour 
surface. For example, the function 


$$f(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}}$$

has the fixed value 1/R at points where $x^2 + y^2 + z^2 = R^2$. These points
lie on the surface of a sphere, so the contour surfaces of this function are 
concentric spheres centred on the origin. One of these could be shown in a 
perspective view, or we could show a cross-section through many of them. 
This is what has been done in Figure 9. Note that the concentric circles in 
this case are not contour lines, but are cross-sections through spherical 
contour surfaces, which extend equally above and below the horizontal 
plane of the page. 
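
The statement that the contour surfaces are spheres can be checked numerically. The sketch below picks four points by hand on the sphere of radius R = 2 and evaluates the function at each; the choice of R and of the sample points is arbitrary and made only for illustration.

```python
import math

def f(x, y, z):
    """f(x, y, z) = 1 / sqrt(x^2 + y^2 + z^2), defined away from the origin."""
    return 1.0 / math.sqrt(x**2 + y**2 + z**2)

R = 2.0
# Four points chosen by hand on the sphere x^2 + y^2 + z^2 = R^2.
sphere_points = [
    (R, 0.0, 0.0),
    (0.0, -R, 0.0),
    (0.0, 0.0, R),
    (R / math.sqrt(3), R / math.sqrt(3), R / math.sqrt(3)),
]
values = [f(*p) for p in sphere_points]
print(values)  # each value is close to 1/R = 0.5
```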


Don’t worry that you can’t see things in more than three dimensions — no 
one can! This does not matter for our purposes. In order to motivate the 
key concepts for functions of many variables, the vital step is going from a 
function f(x) of one variable to a function f(x,y) of two variables. Once 
this step is made, we can easily extend our methods to functions of many 
variables without any need for diagrams. 


1.4 First-order partial derivatives 


Suppose that a variable f depends on x, and this dependence is described 
by the function f(x). Then we can calculate the derivative df/dx, and this
tells us the slope of a graph of y = f(x). Equivalently, it tells us the rate of 
change of f with respect to x. 


We would like to extend these ideas, initially to a function f(x, y) of two
variables. Corresponding to the slope of the graph y = f(x), we have the 
slope of the surface z = f(x,y). Here we run into a new feature: the slope 
of a surface depends on our direction of travel. 


Let us imagine that the function f(x,y) represents the height of a hill, 
expressed in terms of the Cartesian coordinates x and y of points in a 
horizontal plane. Starting at some point on the hill, you might choose to 
go in a direction that goes directly up the hill, climbing steeply, or you 
might choose to go in a more oblique direction across the face of the hill, 
climbing less steeply. Roads in mountainous country often meander over 
the terrain, with many hairpin bends designed to keep the magnitude of 
the slope within safe limits (Figure 10). The slope encountered at any 
given point depends on the line chosen for the road. 


More generally, for any surface z = f(x, y), the slope of the surface
depends on the direction in which we move. In the next section, we will
find a way of calculating the slope in any given direction, but first, we
concentrate on two special directions: the x-direction and the y-direction.
For each of these directions, we can define a slope by differentiating f(x, y)
in an appropriate way. This leads to the concept of a partial derivative,
which we will now explain.


Consider the surface z = f(x,y) obtained from the function 
$$f(x, y) = x + y^3 + 2x^2 y^2. \tag{4}$$


Suppose that we want to find the slope of this surface in the x-direction at 
the point x = 2, y = 3. Because we want the slope in the x-direction, the
value of y is fixed at the value y = 3, and we can substitute this into 
equation (4) to obtain the function 


$$f(x, 3) = x + 27 + 18x^2.$$


This is the section function at constant y, with y = 3. It is a function only 
of x, and can be differentiated with respect to x to give 


$$\frac{d}{dx} f(x, 3) = 1 + 36x. \tag{5}$$


This derivative is the slope of the surface z = f(x,y) in the x-direction for 
any value of x and for y = 3. Finally, substituting x = 2 in equation (5), 
we conclude that the slope in the x-direction at the point x = 2, y = 3 is
equal to 1 + 36 × 2 = 73.


A similar calculation gives the slope in the y-direction at x = 2, y= 3. In 
this case we substitute x = 2 in equation (4) to obtain the section function 


$$f(2, y) = 2 + y^3 + 8y^2.$$


This can be differentiated with respect to the remaining variable y to give 


$$\frac{d}{dy} f(2, y) = 3y^2 + 16y.$$


Then, substituting in the value y = 3, the slope in the y-direction at the 
point x = 2, y = 3 is 3 × 9 + 16 × 3 = 75. So the slopes in the x- and
y-directions are not the same. 


We could carry out similar calculations for any values of x and y in the 
domain of f(x, y). At any given point, we can find the slope of z = f(x, y)
in the x-direction by differentiating with respect to x while treating y as a 
constant. Similarly, we can find the slope in the y-direction by 
differentiating with respect to y while treating x as a constant. Derivatives 
of this type, where certain variables are held constant, are known as partial 
derivatives, and they are written using curly dees as 

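$$\frac{\partial f}{\partial x} \quad \text{and} \quad \frac{\partial f}{\partial y},$$

or equivalently, as

$$\partial f/\partial x \quad \text{and} \quad \partial f/\partial y.$$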


The expression $\partial f/\partial x$ is read as ‘partial dee f by dee x’.


Partial derivatives


Given a function f(x, y) of two variables x and y, the partial
derivative $\partial f/\partial x$ is obtained by differentiating f(x, y) with respect
to x while treating y as a constant. Similarly, the partial derivative
$\partial f/\partial y$ is obtained by differentiating f(x, y) with respect to y while
treating x as a constant.


Both $\partial f/\partial x$ and $\partial f/\partial y$ are also called first-order partial
derivatives because they involve a single differentiation.


The partial derivative $\partial f/\partial x$ is equal to the slope in the x-direction of the
surface z = f(x, y). It tells us the rate of change of f with respect to x
when we move in the direction of increasing x, keeping y fixed. The partial
derivative $\partial f/\partial y$ is equal to the slope in the y-direction of the surface
z = f(x, y). It tells us the rate of change of f with respect to y when we
move in the direction of increasing y, keeping x fixed.


A more formal definition of first-order partial derivatives is 


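$$\frac{\partial f}{\partial x} = \lim_{\delta x \to 0} \frac{f(x + \delta x, y) - f(x, y)}{\delta x}, \tag{6}$$

$$\frac{\partial f}{\partial y} = \lim_{\delta y \to 0} \frac{f(x, y + \delta y) - f(x, y)}{\delta y}. \tag{7}$$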


It is worth noting that the two terms in the numerator on the right-hand 
side of equation (6) have the same value of y, which corresponds to 
treating y as a constant. By contrast, the two terms in the numerator on 
the right-hand side of equation (7) have the same value of x, and this 
corresponds to treating x as a constant. 


As with ordinary differentiation, we tend to bypass these formal definitions 
and use the familiar rules of calculus to calculate partial derivatives. The 
only new feature is that some variables must be treated as constants 
during the differentiation. 


Finding a partial derivative is no harder than finding an ordinary 
derivative. Just remember that all variables except the one involved 
in the differentiation must be treated as constants throughout the 
calculation. 


Example 1 


Find the partial derivatives $\partial f/\partial x$ and $\partial f/\partial y$ of the function
$f(x, y) = x + y^3 + 2x^2 y^2$ in equation (4).

Solution

To find $\partial f/\partial x$, we differentiate with respect to x, treating y as a constant:

$$\frac{\partial f}{\partial x} = 1 + 0 + 4xy^2 = 1 + 4xy^2.$$

The effect of holding y constant differs from term to term. In the term $y^3$,
partial differentiation with respect to x gives zero because the derivative of
any constant is zero. In the term $2x^2 y^2$, the expression $2y^2$ is a constant
coefficient for $x^2$, and this coefficient is unchanged by the differentiation.

To find $\partial f/\partial y$, we differentiate with respect to y, treating x as a constant:

$$\frac{\partial f}{\partial y} = 0 + 3y^2 + 4x^2 y = 3y^2 + 4x^2 y.$$


The partial derivatives that we have just calculated are functions of x and
y because they refer to a general point (x, y) in the domain of the
function f. To find the value of a partial derivative at a particular point,
we substitute appropriate values of x and y. For the function in the above
example, we see that at the point x = 2, y = 3, the values of the partial
derivatives are

$$\left.\frac{\partial f}{\partial x}\right|_{x=2,\,y=3} = 1 + 4 \times 2 \times 9 = 73,$$

$$\left.\frac{\partial f}{\partial y}\right|_{x=2,\,y=3} = 3 \times 9 + 4 \times 4 \times 3 = 75.$$


These agree with the slopes calculated earlier using section functions. 
However, the use of partial derivatives is far more efficient because it gives 
the correct results for all values of x and y. 
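
Because the partial derivatives are themselves functions of x and y, they can be coded once and then evaluated wherever they are needed. The sketch below does this for the two formulas from Example 1, 1 + 4xy² and 3y² + 4x²y; the extra sample points in the loop are arbitrary choices made for illustration.

```python
def f_x(x, y):
    """Partial derivative of f = x + y^3 + 2 x^2 y^2 with respect to x."""
    return 1 + 4 * x * y**2

def f_y(x, y):
    """Partial derivative of the same f with respect to y."""
    return 3 * y**2 + 4 * x**2 * y

print(f_x(2, 3), f_y(2, 3))   # 73 75

# The same two formulas give the slopes at any other point.
for point in [(0, 0), (1, -1), (-2, 0.5)]:
    print(point, f_x(*point), f_y(*point))
```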


We have seen that the partial derivatives of f(x, y) are functions of x
and y. This is sometimes made explicit by using the alternative notations
$f_x(x, y)$ and $f_y(x, y)$ instead of $\partial f/\partial x$ and $\partial f/\partial y$. In this case, the
subscript x or y indicates the variable with respect to which the derivative
is being taken. At a particular point, where x = a and y = b, the values of
the partial derivatives are denoted by $f_x(a, b)$ and $f_y(a, b)$, which is more
compact than curly dee notation.


Of course, the two independent variables need not be denoted by x and y; 
any variable names will do, as the next example shows. 


Example 2

Given $f(u, v) = u^2 + \sin(uv)$, calculate $f_u(\pi/2, 1)$ and $f_v(\pi/2, 1)$.

Solution

Differentiating f(u, v) partially with respect to u gives

$$f_u(u, v) = 2u + v\cos(uv),$$

so

$$f_u(\pi/2, 1) = \pi + \cos(\pi/2) = \pi.$$

This differentiation uses the ‘function of a function’ rule.

Differentiating f(u, v) partially with respect to v gives

$$f_v(u, v) = 0 + u\cos(uv),$$

so

$$f_v(\pi/2, 1) = \tfrac{\pi}{2}\cos(\pi/2) = 0.$$

$f_x(x, y)$ and $f_y(x, y)$ may be abbreviated to $f_x$ and $f_y$.
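
The results of Example 2 can be checked against a numerical estimate. The sketch below takes f(u, v) = u² + sin(uv), codes the two partial derivatives found above, and compares them at (π/2, 1) with symmetric difference quotients. The helper `central_difference` and its step size are assumptions made for this check, not part of the method in the text.

```python
import math

def f(u, v):
    """f(u, v) = u^2 + sin(u v), as in Example 2."""
    return u**2 + math.sin(u * v)

def f_u(u, v):
    """Partial derivative with respect to u (v held constant)."""
    return 2 * u + v * math.cos(u * v)

def f_v(u, v):
    """Partial derivative with respect to v (u held constant)."""
    return u * math.cos(u * v)

def central_difference(func, t, h=1e-6):
    """Symmetric difference quotient for a function of one variable."""
    return (func(t + h) - func(t - h)) / (2 * h)

u0, v0 = math.pi / 2, 1.0
numeric_u = central_difference(lambda u: f(u, v0), u0)
numeric_v = central_difference(lambda v: f(u0, v), v0)
print(f_u(u0, v0), numeric_u)   # both close to pi
print(f_v(u0, v0), numeric_v)   # both close to 0
```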


Exercise 2

Given $g(\theta, \phi) = \sin\theta + \cos\phi\,\tan\theta$, find $\partial g/\partial\theta$ and $\partial g/\partial\phi$.

Exercise 3 


(a) Given f(x,y) = (a? +y")e**, find fr(x,y) and fy(x,y). 


(b) What is the slope of the surface z = f(x,y) in the x-direction at the 
point (0,1)? 


The concept of a partial derivative can easily be extended to functions of
more than two variables. For a function f(x, y, z) of three variables, $\partial f/\partial x$
no longer represents the slope of a surface, but it does represent the rate of
change of f with respect to x when the other variables y and z have fixed
values. The partial derivative $\partial f/\partial x$ is calculated by differentiating
f(x, y, z) with respect to x while keeping y and z fixed (and similarly for
the other partial derivatives). More generally, we keep all but one of the
variables fixed and differentiate with respect to the remaining variable.


Example 3

Given $V(x, y, z) = (x^2 + y^2 + z^2)^{-1/2}$, calculate $\partial V/\partial x$, $\partial V/\partial y$ and $\partial V/\partial z$.

Solution

Partial differentiation with respect to x, with y and z constant, gives

$$\frac{\partial V}{\partial x} = -\tfrac{1}{2}(x^2 + y^2 + z^2)^{-3/2} \times 2x = -\frac{x}{(x^2 + y^2 + z^2)^{3/2}}.$$

Partial differentiation with respect to y, with x and z constant, gives

$$\frac{\partial V}{\partial y} = -\tfrac{1}{2}(x^2 + y^2 + z^2)^{-3/2} \times 2y = -\frac{y}{(x^2 + y^2 + z^2)^{3/2}},$$

and partial differentiation with respect to z, with x and y constant, gives

$$\frac{\partial V}{\partial z} = -\tfrac{1}{2}(x^2 + y^2 + z^2)^{-3/2} \times 2z = -\frac{z}{(x^2 + y^2 + z^2)^{3/2}}.$$

(Because the function V(x, y, z) is symmetric in x, y and z, the answers for
$\partial V/\partial y$ and $\partial V/\partial z$ can be obtained from the answer for $\partial V/\partial x$ by
interchanging symbols. This is a useful check in this case.)
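
For functions of three variables, the same rule of changing one variable at a time carries over directly to a numerical check. The sketch below compares the three formulas of Example 3, for V = (x² + y² + z²)^(−1/2), with symmetric difference quotients at the point (1, 2, 2). The helper names `partials_V` and `numeric_partials`, the step size and the sample point are all choices made for this illustration.

```python
def V(x, y, z):
    """V(x, y, z) = (x^2 + y^2 + z^2)^(-1/2), as in Example 3."""
    return (x**2 + y**2 + z**2) ** -0.5

def partials_V(x, y, z):
    """The three first-order partial derivatives found in Example 3."""
    r_cubed = (x**2 + y**2 + z**2) ** 1.5
    return (-x / r_cubed, -y / r_cubed, -z / r_cubed)

def numeric_partials(func, point, h=1e-6):
    """Symmetric difference quotients, changing one coordinate at a time."""
    result = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        result.append((func(*up) - func(*down)) / (2 * h))
    return tuple(result)

p = (1.0, 2.0, 2.0)                 # here x^2 + y^2 + z^2 = 9
exact = partials_V(*p)
approx = numeric_partials(V, p)
print(exact)                        # close to (-1/27, -2/27, -2/27)
print(approx)                       # close to the line above
```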


Exercise 4 


(a) Given f(z,y,z) =(1+2)?+ (14+ y)?+ (14 z)4, calculate the partial 
derivatives $\partial f/\partial x$, $\partial f/\partial y$ and $\partial f/\partial z$.


(b) Find all the first-order partial derivatives of the function 


(29.5) = zy t* shige = Qry + y. 


1.5 Higher-order partial derivatives 


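You have seen that the first-order partial derivatives of a function f(x, y)
are themselves functions of x and y. For example, the first-order partial
derivatives of the function $f(x, y) = x + y^3 + 2x^2 y^2$ are

$$\frac{\partial f}{\partial x} = 1 + 4xy^2 \quad \text{and} \quad \frac{\partial f}{\partial y} = 3y^2 + 4x^2 y, \tag{8}$$

as you saw in Example 1. We can go on to differentiate $\partial f/\partial x$ partially
with respect to x to obtain

$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial x}(1 + 4xy^2) = 4y^2.$$

Each of $\partial f/\partial x$ and $\partial f/\partial y$ can be partially differentiated with respect to
either x or y, so equations (8) also give

$$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial y}(1 + 4xy^2) = 8xy,$$

$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial x}(3y^2 + 4x^2 y) = 8xy,$$

$$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial y}(3y^2 + 4x^2 y) = 6y + 4x^2.$$

These four functions, obtained by partially differentiating
$f(x, y) = x + y^3 + 2x^2 y^2$ twice, are called second-order partial
derivatives.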


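Second-order partial derivatives are written down using a natural
extension of the notation for first-order partial derivatives. In curly-dee
notation, we define

$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right), \qquad \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right),$$

$$\frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right), \qquad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right),$$

and in the alternative subscript notation,

$$f_{xx}(x, y) = \frac{\partial^2 f}{\partial x^2}, \qquad f_{yx}(x, y) = \frac{\partial^2 f}{\partial y\,\partial x},$$

$$f_{yy}(x, y) = \frac{\partial^2 f}{\partial y^2}, \qquad f_{xy}(x, y) = \frac{\partial^2 f}{\partial x\,\partial y}.$$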


The value of the partial derivative $f_{xx}(x, y)$ at a particular point x = a,
y = b is then written as $f_{xx}(a, b)$, with similar notation for the other
second-order partial derivatives.


If you can partially differentiate once, then you can partially differentiate 
again, so you should have no problem in calculating second-order partial 
derivatives. The following example illustrates the technique. 


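Example 4

Determine the second-order partial derivatives of the function
$f(x, y) = e^x \cos y + x^2 - y + 1$.

Solution

We have

$$\frac{\partial f}{\partial x} = e^x \cos y + 2x, \qquad \frac{\partial f}{\partial y} = -e^x \sin y - 1,$$

so

$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}(e^x \cos y + 2x) = e^x \cos y + 2,$$

$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}(e^x \cos y + 2x) = -e^x \sin y,$$

$$\frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}(-e^x \sin y - 1) = -e^x \cos y,$$

$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}(-e^x \sin y - 1) = -e^x \sin y.$$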


Exercise 5 


Given $f(x, y) = x \sin y$, calculate the second-order partial derivatives $f_{xx}$,
$f_{yx}$, $f_{yy}$ and $f_{xy}$, and evaluate them at $(2, \pi)$.


The second-order partial derivatives of a function f(x, y) can be
interpreted geometrically. Just as first-order partial derivatives tell us
about the slopes of the surface z = f(x, y) in the x- and y-directions, so
second-order partial derivatives tell us about the rates of change of these
slopes when we move in various directions. In a derivative like $f_{xx}$, we
differentiate twice with respect to x, holding y constant. The first
differentiation gives us the slope of the surface in the x-direction. In
general, this slope varies from point to point, and $f_{xx}$ tells us how rapidly
the slope in the x-direction changes as we move in the x-direction.
Similarly, $f_{yx}$ tells us how rapidly the slope in the x-direction changes as
we move in the y-direction, and so on. Figure 11 illustrates these facts.


Figure 11 (a) At point P, $f_x > 0$ and $f_y > 0$, while $f_{xx} > 0$ and $f_{yy} < 0$.
(b) At point Q, $f_x > 0$ and $f_y > 0$, while $f_{xy} > 0$ and $f_{yx} > 0$. Mixed
partial derivatives such as $f_{xy}$ and $f_{yx}$ are non-zero for surfaces with a
‘twist’ in them.


Partial derivatives such as $f_{xy}$ and $f_{yx}$, which involve differentiations with
respect to different variables, are called mixed partial derivatives. In
Example 4 and Exercise 5 (as well as in the work on the function at the
beginning of this subsection) you can see that $f_{xy} = f_{yx}$. In fact, this
property is guaranteed by the following theorem, which we do not prove.


Mixed partial derivative theorem


For any function f(x, y) that is sufficiently smooth,

$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}, \quad \text{or equivalently,} \quad f_{xy} = f_{yx}.$$

A similar result applies to a smooth function $f(x_1, x_2, \ldots, x_n)$ of
n variables. Again, the order of differentiation does not matter, so

$$\frac{\partial^2 f}{\partial x_i\,\partial x_j} = \frac{\partial^2 f}{\partial x_j\,\partial x_i} \quad \text{for all } x_i \text{ and } x_j.$$

When $x_i = x_j$ this is a trivial identity. The interest lies in the case $x_i \ne x_j$.


You do not need to know precisely what is meant by ‘smooth’ in this
context. In fact, we will assume that the mixed partial derivative theorem
applies to all the functions that you will meet in this module.
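
Although we do not prove the theorem, it is easy to test it numerically for a particular function. The sketch below takes the function of Example 4, f(x, y) = eˣ cos y + x² − y + 1, and estimates the two mixed partial derivatives by applying difference quotients in the two possible orders. The helpers `d_dx` and `d_dy`, the step size and the sample point are assumptions made for this illustration.

```python
import math

def f(x, y):
    """f(x, y) = e^x cos y + x^2 - y + 1, the function of Example 4."""
    return math.exp(x) * math.cos(y) + x**2 - y + 1

def d_dx(func, h=1e-4):
    """Return a function approximating the partial derivative of func in x."""
    return lambda x, y: (func(x + h, y) - func(x - h, y)) / (2 * h)

def d_dy(func, h=1e-4):
    """Return a function approximating the partial derivative of func in y."""
    return lambda x, y: (func(x, y + h) - func(x, y - h)) / (2 * h)

f_yx = d_dy(d_dx(f))   # differentiate in x first, then in y
f_xy = d_dx(d_dy(f))   # differentiate in y first, then in x

x0, y0 = 0.5, 1.2
exact = -math.exp(x0) * math.sin(y0)   # mixed derivative found in Example 4
print(f_yx(x0, y0), f_xy(x0, y0), exact)
```

The two estimates agree with each other and with the exact value to within the accuracy of the difference quotients, as the theorem leads us to expect for a smooth function.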


Exercise 6 
Given f(a, y) = e214, calculate fry(0,0) and fyx(0, 0). 


Exercise 7 


Given $f(x, t) = \cos(3x - 2t)$, find expressions for $f_{xx}$ and $f_{tt}$. Hence write
down a relationship between $f_{xx}$ and $f_{tt}$ that applies for all values of x
and t for the given function.
2 Chain rules and the gradient vector


Note how the notation is used: $df/dx$ is a derivative, but $\delta f/\delta x$
is a quotient of two small quantities, $\delta f$ and $\delta x$.


The key result is given in equations (18) and (19) below.


This section describes a number of rules known as chain rules. One of 
these rules will allow us to find the slope of a surface when we move in an 
arbitrary direction (not just parallel to the x- or y-axis). In the process, we 
will use partial derivatives to define a quantity called the gradient vector. 
This will be important later in this unit, and for later units in this book. 


2.1 The chain rule for small changes 


Suppose that a variable f depends on a single variable x, and this
dependence is described by the function f(x). Then we can write


$$\frac{df}{dx} \simeq \frac{\delta f}{\delta x}, \tag{9}$$

provided that $\delta x$ and the corresponding $\delta f$ are small enough. Rearranging
this equation gives an approximate formula for the change in f that
accompanies a very small change in x:


$$\delta f \simeq \frac{df}{dx}\,\delta x. \tag{10}$$


We therefore see that $df/dx$ determines the sensitivity of f to small
changes in x. This is important when x is found by measurement and the
function value f(x) is deduced from it: if the derivative $df/dx$ has a large
magnitude for the value of x of interest, then we need to be very accurate
in our measurement of x in order to make a reasonable estimate of f.


We would now like to extend this idea to a function f(x,y) of two 
variables. In this case, we are interested in the change in f that arises 
when x and y both change by small amounts. 


Figure 12 shows a contour map of a function f(x, y).


[Figure 12: contour lines labelled from 7.4 to 7.8, with the points $P = (x_0, y_0)$, $Q = (x_0 + \delta x, y_0)$ and $R = (x_0 + \delta x, y_0 + \delta y)$ marked.]


Figure 12 A contour map of a function f(x,y). We consider the change
in the value of f as we move from $P = (x_0, y_0)$ to a neighbouring point
$R = (x_0 + \delta x, y_0 + \delta y)$.




We consider a point P in the xy-plane with coordinates $(x_0, y_0)$, and a
neighbouring point R with coordinates $(x_0 + \delta x, y_0 + \delta y)$. The change in f
between these two points is


$$\delta f = f(x_0 + \delta x, y_0 + \delta y) - f(x_0, y_0). \tag{11}$$


The journey from P to R can be taken in two steps. In the first step, the
value of y is held constant as we move from P to an intermediate point Q
with coordinates $(x_0 + \delta x, y_0)$. In the second step, the value of x is held
constant as we move from Q to R.


The change in the value of f in going from P to Q is


$$\delta f_1 = f(x_0 + \delta x, y_0) - f(x_0, y_0), \tag{12}$$

while the change in f in going from Q to R is

$$\delta f_2 = f(x_0 + \delta x, y_0 + \delta y) - f(x_0 + \delta x, y_0). \tag{13}$$

Adding equations (12) and (13), and comparing with equation (11), we see
that

$$\delta f = \delta f_1 + \delta f_2. \tag{14}$$


You can check this for the case drawn in Figure 12: in this case,
$\delta f = 7.8 - 7.4 = 0.4$, $\delta f_1 = 7.6 - 7.4 = 0.2$ and $\delta f_2 = 7.8 - 7.6 = 0.2$.


More generally, we can say that the change in f is cumulative. It is the
sum of the change $\delta f_1$ obtained when moving in the x-direction and the
change $\delta f_2$ obtained when moving in the y-direction. Fortunately, we can
estimate $\delta f_1$ and $\delta f_2$ quite easily.


The journey from P to Q starts at the point $(x_0, y_0)$ and makes a
displacement $\delta x$ with y held constant. The resulting change in f is
$\delta f = \delta f_1$. By analogy with equation (9), we can therefore write


$$\frac{\partial f}{\partial x} \simeq \frac{\delta f_1}{\delta x}.$$

A partial derivative appears on the left because the journey is made with y
held constant. This partial derivative is evaluated at point P, and so can be
written as $f_x(x_0, y_0)$. Rearranging the equation, we therefore conclude that


$$\delta f_1 \simeq f_x(x_0, y_0)\,\delta x, \tag{15}$$

which is very like equation (10).


A similar argument can be given for the journey from Q to R. In this case,
we start from the point Q, with coordinates $(x_0 + \delta x, y_0)$, and make a
displacement $\delta y$ with x held constant. In this case, we get


$$\delta f_2 \simeq f_y(x_0 + \delta x, y_0)\,\delta y. \tag{16}$$


In fact, this expression is needlessly complicated. If the function $f_y$ varies
smoothly, and $\delta x$ is very small, then we can replace $f_y(x_0 + \delta x, y_0)$ by
$f_y(x_0, y_0)$. You might think that this would introduce a small error, but
this really does not matter. It turns out that the difference between
$f_y(x_0 + \delta x, y_0)$ and $f_y(x_0, y_0)$ is proportional to $\delta x$, and because the last
term in equation (16) includes a factor $\delta y$, the overall error introduced by
the replacement is proportional to $\delta x\,\delta y$. Such an error can be neglected in
comparison to the terms that we are keeping. It is good enough to say that


$$\delta f_2 \simeq f_y(x_0, y_0)\,\delta y. \tag{17}$$


Finally, using equation (14), we see that the total change in f in going
from a point $(x_0, y_0)$ to a neighbouring point $(x_0 + \delta x, y_0 + \delta y)$ is


$$\delta f \simeq f_x(x_0, y_0)\,\delta x + f_y(x_0, y_0)\,\delta y.$$


This equation is valid throughout the domain of the function, so we can
replace $(x_0, y_0)$ by (x, y) to obtain our final result, the chain rule.


Chain rule for small changes 


If f(x,y) is a smooth function, then the change in the dependent
variable f that occurs in response to small changes in x and y is


$$\delta f \simeq f_x(x,y)\,\delta x + f_y(x,y)\,\delta y. \tag{18}$$

Using curly dee notation, this rule can also be written in the form

$$\delta f \simeq \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y. \tag{19}$$

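
As a rough numerical illustration of equation (18), the following Python sketch compares the chain-rule estimate with the exact change for a small displacement. The test function $f(x,y) = xy^2$, the starting point and the step sizes are assumptions chosen only for illustration; they do not come from the text.

```python
# Minimal sketch: chain rule for small changes, equation (18).
# The function f(x, y) = x * y**2 is a hypothetical example.

def f(x, y):
    return x * y**2

def f_x(x, y):
    # Partial derivative of f with respect to x (y held constant).
    return y**2

def f_y(x, y):
    # Partial derivative of f with respect to y (x held constant).
    return 2 * x * y

def chain_rule_estimate(x, y, dx, dy):
    # delta f ~ f_x * dx + f_y * dy
    return f_x(x, y) * dx + f_y(x, y) * dy

x0, y0 = 1.0, 2.0      # hypothetical starting point
dx, dy = 0.01, -0.02   # hypothetical small changes

estimate = chain_rule_estimate(x0, y0, dx, dy)
exact = f(x0 + dx, y0 + dy) - f(x0, y0)
print("estimate:", estimate)
print("exact:   ", exact)
```

The two printed values should agree closely, and the gap between them should shrink as the step sizes are reduced, because the neglected terms are products of small quantities.
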
Example 5 


The power P output by a star is given by the formula $P = kAT^4$, where A
is the star’s surface area, T is its surface temperature, and k is a constant.
Derive a formula relating the fractional change in power output $\delta P/P$ to
the corresponding fractional changes $\delta A/A$ and $\delta T/T$. You may assume
that $\delta A$ and $\delta T$ are small enough for the chain rule to apply.


Solution


The first-order partial derivatives are


$$\frac{\partial P}{\partial A} = kT^4 \quad\text{and}\quad \frac{\partial P}{\partial T} = 4kAT^3,$$

so the chain rule gives

$$\delta P \simeq \frac{\partial P}{\partial A}\,\delta A + \frac{\partial P}{\partial T}\,\delta T = kT^4\,\delta A + 4kAT^3\,\delta T.$$

Dividing both sides by $P = kAT^4$, we get

$$\frac{\delta P}{P} \simeq \frac{\delta A}{A} + 4\,\frac{\delta T}{T}.$$


The coefficient 4 multiplying $\delta T/T$ shows that a 1% increase in
temperature produces an approximately 4% increase in power output.
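

The 4% figure can be checked numerically. The short Python sketch below compares the chain-rule estimate with the exact fractional change for a 1% rise in temperature at fixed area. The values of k, A and T are placeholders (assumptions); the fractional change does not depend on them.

```python
# Minimal sketch: fractional change in P = k * A * T**4 for a 1% rise in T.
# The numerical values of k, A and T below are hypothetical placeholders.

def power(area, temperature, k=1.0):
    return k * area * temperature**4

def fractional_change_estimate(dA_over_A, dT_over_T):
    # Chain-rule result from Example 5: dP/P ~ dA/A + 4 dT/T
    return dA_over_A + 4 * dT_over_T

A, T = 2.0, 5000.0
exact = power(A, 1.01 * T) / power(A, T) - 1.0
estimate = fractional_change_estimate(0.0, 0.01)

print("chain-rule estimate:", estimate)  # 0.04
print("exact change:       ", exact)     # about 0.0406
```

The exact change is $1.01^4 - 1 \approx 0.0406$, so the chain-rule estimate of 0.04 is accurate to within about 0.06 percentage points for a change of this size.
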




Exercise 8 


A quantity f is given by the function

$$f = f(x, y) = \frac{xy}{x + y}.$$


Find $\partial f/\partial x$ and $\partial f/\partial y$, and use the chain rule to estimate the small
change in f that occurs when x changes from 1.00 to 1.01, and y changes
from 4.00 to 3.99.


2.2 Other versions of the chain rule 


In ordinary calculus, we often have to differentiate ‘a function of a
function’. Such a situation arises if the velocity v = v(x) of a particle is
given as a function of its position x = x(t), which in turn is a function of
time t. We can then write


$$v(t) = v(x(t)).$$

You know how to differentiate v with respect to t in such a case. We use
the ordinary chain rule of calculus:


$$\frac{dv}{dt} = \frac{dv}{dx}\,\frac{dx}{dt}. \tag{20}$$

So if $v(x) = x^2$ and $x(t) = \sin t$, we have

$$\frac{dv}{dt} = \frac{dv}{dx} \times \frac{dx}{dt} = 2x \times \cos t = 2\sin t\,\cos t,$$


where we have used $x = \sin t$ in the last step. We could, of course, get the
same result by noting that $v = x^2 = \sin^2 t$, and then differentiating $\sin^2 t$
directly; this makes implicit use of the chain rule, but we have chosen to
be more explicit.


In this subsection, we will look at various generalisations of equation (20) 
to functions of more than one variable. All these results will be based on 
equations (18) and (19), and they are all called chain rules. 


Differentiation with respect to a parameter 


Suppose that x and y are both functions of the same variable t. For 
example, we could have 


$$x = R\cos t \quad\text{and}\quad y = R\sin t, \tag{21}$$


where R is a constant and $0 \le t < 2\pi$. The variable t is called a
parameter. It may represent time, but this is not essential; t could be
any quantity that increases as we trace out a given path.


The fact that x and y are both functions of t implies that they are related
to one another. In the present case, x and y both lie on a circle of radius R
centred on the origin, as indicated in Figure 13, and we say that
equations (21) provide a parametric representation of this circle. For a
given value of t, they specify a definite point on this circle, and as t
increases from 0 to $2\pi$, we go once around the circle anticlockwise.


Margin note (Exercise 8): Note that $1/f = 1/x + 1/y$.
Relationships like this occur
when describing electrical
circuits and optical lenses.


Figure 13 The parametric
representation $x = R\cos t$,
$y = R\sin t$, $0 \le t < 2\pi$, for
points on a circle traced
anticlockwise


Figure 14 Contour map of a
rock pinnacle (orange) with
the projection on the
xy-plane of the path followed
by a mountaineer (blue)
Suppose that we are given a function f(x,y) with x and y related to the
parameter t by equations of the form x = x(t), y = y(t), so that


$$f = f(x(t), y(t)).$$

Then we can ask: what is the rate of change of f with respect to t? The
answer is easily found by dividing both sides of equation (19) (the chain
rule for small changes) by a small change $\delta t$:


$$\frac{\delta f}{\delta t} \simeq \frac{\partial f}{\partial x}\,\frac{\delta x}{\delta t} + \frac{\partial f}{\partial y}\,\frac{\delta y}{\delta t}, \tag{22}$$


and then taking the limit as $\delta t$ tends to zero. This gives the following
version of the chain rule.


Chain rule for differentiation with respect to a parameter

If f = f(x(t), y(t)), then

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt}. \tag{23}$$


It is worth reflecting on the notation used here. Because x(t) and y(t) are
functions of a single variable, we use the ordinary derivatives $dx/dt$ and
$dy/dt$, written with straight dees. By contrast, f(x,y) is a function of two
variables, so we use the partial derivatives $\partial f/\partial x$ and $\partial f/\partial y$, written with
curly dees. Neither of these should be confused with ratios of small
quantities, such as $\delta x/\delta t$ or $\delta y/\delta t$, which appear in equation (22).


Equation (23) is a direct extension of equation (20) to functions of two 
variables. We could, of course, substitute the functions x(t) and y(t) 
directly into f(x,y) and carry out an ordinary differentiation with respect 
to t. Nevertheless, we will use equation (23) in the following example and 
exercises because this version of the chain rule is a valuable result with 
many uses and it is important to become familiar with it. 


Example 6 


With distances measured in metres, the height h of a rock pinnacle
depends on horizontal coordinates x and y according to the function


$$h(x, y) = 1000 - 3x^2 - 2xy - 4y^2$$

for $-10 \le x \le 10$ and $-10 \le y \le 10$.


A mountaineer follows the blue path shown in Figure 14, with her x- and
y-coordinates given by the functions


$$x(t) = 5\cos(2t) \quad\text{and}\quad y(t) = 5\sin(2t) \quad\text{for } 0 \le t \le \pi/2.$$


Use the chain rule to calculate $dh/dt$ as a function of t.




Solution 


The chain rule (in the form of equation (23)) tells us that

$$\frac{dh}{dt} = \frac{\partial h}{\partial x}\,\frac{dx}{dt} + \frac{\partial h}{\partial y}\,\frac{dy}{dt}.$$


The required partial derivatives are


$$\frac{\partial h}{\partial x} = -6x - 2y, \quad \frac{\partial h}{\partial y} = -2x - 8y,$$

while the derivatives of x(t) and y(t) are

$$\frac{dx}{dt} = -10\sin(2t), \quad \frac{dy}{dt} = 10\cos(2t).$$

We therefore obtain

$$\frac{dh}{dt} = (-6x - 2y)(-10\sin(2t)) + (-2x - 8y)(10\cos(2t))$$

$$= 20\big((3x + y)\sin(2t) - (x + 4y)\cos(2t)\big).$$

Finally, we use the parametric equations for x and y to express everything
in terms of t. Direct substitution gives

$$\frac{dh}{dt} = 100\big(3\cos(2t)\sin(2t) + \sin^2(2t) - \cos^2(2t) - 4\sin(2t)\cos(2t)\big)$$

$$= 100\big(\sin^2(2t) - \cos^2(2t) - \sin(2t)\cos(2t)\big).$$

Although it is not essential to do so, we could simplify this answer using
trigonometric identities to obtain

$$\frac{dh}{dt} = -50\big(2\cos(4t) + \sin(4t)\big).$$


The final answer is the rate of change of h with respect to the parameter t. 


The significance of this parameter is left open in this question: it could 
represent time, in which case dh/dt would be the rate of change of height 
with respect to time, but this is not essential — in general, t could be any 
variable that increases along the path. 
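

The result of Example 6 can be checked numerically. The Python sketch below evaluates $dh/dt$ in three ways: from the chain rule (equation (23)), from the simplified closed form, and from a central finite difference of h along the path. The finite-difference step and the sample value of t are assumptions chosen for illustration.

```python
import math

# Minimal sketch: numerical check of dh/dt from Example 6.

def h(x, y):
    return 1000 - 3 * x**2 - 2 * x * y - 4 * y**2

def path(t):
    # Parametric path x(t) = 5 cos(2t), y(t) = 5 sin(2t)
    return 5 * math.cos(2 * t), 5 * math.sin(2 * t)

def dh_dt_chain_rule(t):
    x, y = path(t)
    dh_dx = -6 * x - 2 * y
    dh_dy = -2 * x - 8 * y
    dx_dt = -10 * math.sin(2 * t)
    dy_dt = 10 * math.cos(2 * t)
    return dh_dx * dx_dt + dh_dy * dy_dt

def dh_dt_simplified(t):
    return -50 * (2 * math.cos(4 * t) + math.sin(4 * t))

def dh_dt_numeric(t, dt=1e-6):
    # Central difference of h(x(t), y(t)); dt is an illustrative choice.
    return (h(*path(t + dt)) - h(*path(t - dt))) / (2 * dt)

t = 0.3  # hypothetical sample value in the range 0 <= t <= pi/2
print(dh_dt_chain_rule(t), dh_dt_simplified(t), dh_dt_numeric(t))
```

All three printed values should agree to several significant figures, which supports both the chain-rule calculation and the trigonometric simplification.
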


Exercise 9 


Given $z = \sin x - 3\cos y$, use the chain rule to find the rate of change of z
with respect to t, where x and y are given by the parametric equations
$x = t^2$ and $y = 2t$.


Exercise 10


Given $z = y\sin x$, use the chain rule to find the rate of change of z with
respect to t along the curve (x(t), y(t)), where $x = e^t$ and $y = t^2$. Evaluate
this rate of change at t = 0.
Figure 15 Polar coordinates:
a point is defined by the
distance r and the angle θ


This is an example of our
deliberate ‘abuse of notation’.
T(x,y) and T(r,θ) are different
mathematical functions
representing the same quantity,
namely temperature.
The chain rule for a change of variables


At the start of this unit we used the example of temperature measured on
the surface of a circular disc. This was described by a function T(x, y) of
the Cartesian coordinates x and y of points on the disc. But there is no
compelling reason to choose Cartesian coordinates. We could equally well
use polar coordinates r and θ, which are defined in Figure 15. From
Figure 15, you can see that x and y are related to r and θ by the equations


$$x = r\cos\theta \quad\text{and}\quad y = r\sin\theta.$$


These equations imply that x and y are functions of r and θ, and we can
express this by writing


$$x = x(r,\theta) \quad\text{and}\quad y = y(r,\theta).$$

The temperature function can then be regarded as a function of r and θ:

$$T(r,\theta) = T(x(r,\theta), y(r,\theta)).$$


We can ask: what is the rate of change of T with respect to r when θ is
held constant? This is given by the partial derivative $\partial T/\partial r$, and we will
now explain how to calculate this partial derivative from expressions for
T(x,y), x(r,θ) and y(r,θ) using another form of the chain rule.


Instead of using the r and θ of polar coordinates, let us consider a more
general situation. Suppose that we are given a function f(x,y) where the
variables x and y are expressed in terms of two other variables, u and v, so
that x = x(u,v) and y = y(u,v). Then we can write


$$f = f(x(u,v), y(u,v)),$$

and regard f as a function of u and v.


If x and y change by small amounts, we know that


$$\delta f \simeq \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y.$$


Now divide both sides of this equation by $\delta u$, under conditions in which v
is held constant. This division gives


$$\frac{\delta f}{\delta u} \simeq \frac{\partial f}{\partial x}\,\frac{\delta x}{\delta u} + \frac{\partial f}{\partial y}\,\frac{\delta y}{\delta u},$$

and if we now take the limit as $\delta u$ tends to zero while holding v constant,
then the ratios of small quantities become partial derivatives. This leads to
another form of the chain rule, used when we change variables from (x, y)
to (u, v).


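Chain rule for a change of variables

If f = f(x,y) with x = x(u,v) and y = y(u,v), then


$$\frac{\partial f}{\partial u} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial u}. \tag{24}$$


By a similar argument, but dividing by $\delta v$ while holding u constant,

$$\frac{\partial f}{\partial v} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial v}. \tag{25}$$

In these equations, f(x,y) is a function of x and y, and $\partial f/\partial x$ implies that
y is held constant, while x(u,v) is a function of u and v, and $\partial x/\partial u$
implies that v is held constant, and so on. The best way to see how this
works is with an example.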


Example 7 


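Suppose that $f(x,y) = xy^2$, where $x = uv$ and $y = u^2 - v^2$. Use the chain
rule to find $\partial f/\partial u$ in terms of u and v.


Solution

We have

$$\frac{\partial f}{\partial x} = y^2 \quad\text{and}\quad \frac{\partial f}{\partial y} = 2xy.$$

Also,

$$\frac{\partial x}{\partial u} = v \quad\text{and}\quad \frac{\partial y}{\partial u} = 2u.$$

So the chain rule (equation (24)) gives

$$\frac{\partial f}{\partial u} = vy^2 + 4uxy = v(u^2 - v^2)^2 + 4u^2 v(u^2 - v^2),$$

which can (optionally) be tidied up to give

$$\frac{\partial f}{\partial u} = v(u^2 - v^2)(5u^2 - v^2).$$
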
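
As a check on Example 7, the Python sketch below evaluates $\partial f/\partial u$ from the chain rule (equation (24)), from the tidied-up closed form, and from a central finite difference in u with v held constant. The sample point (u, v) = (2, 1) and the step size are assumptions made for illustration.

```python
# Minimal sketch: numerical check of Example 7.

def f(x, y):
    return x * y**2

def xy_from_uv(u, v):
    # Change of variables x = u v, y = u^2 - v^2
    return u * v, u**2 - v**2

def df_du_chain_rule(u, v):
    x, y = xy_from_uv(u, v)
    df_dx, df_dy = y**2, 2 * x * y
    dx_du, dy_du = v, 2 * u
    return df_dx * dx_du + df_dy * dy_du

def df_du_closed_form(u, v):
    return v * (u**2 - v**2) * (5 * u**2 - v**2)

def df_du_numeric(u, v, du=1e-6):
    # Central difference in u, holding v constant.
    return (f(*xy_from_uv(u + du, v)) - f(*xy_from_uv(u - du, v))) / (2 * du)

u, v = 2.0, 1.0  # hypothetical sample point
print(df_du_chain_rule(u, v), df_du_closed_form(u, v), df_du_numeric(u, v))
```

At this sample point the chain rule and the closed form both give 57, and the finite-difference value should agree to several decimal places.
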
Exercise 11 


Suppose that $f(x,y) = x^2 + y^2$, where $x = 2u + 3v$ and $y = 3u - 2v$. Use
the chain rule to find the partial derivatives $\partial f/\partial u$ and $\partial f/\partial v$, and
evaluate your answers at (u,v) = (1,2).


Exercise 12


Suppose that $f(x,y) = x^2 - y^2$, where $x = r\cos\theta$ and $y = r\sin\theta$. Use the
chain rule to find the partial derivatives $\partial f/\partial r$ and $\partial f/\partial\theta$.


2.3 Slope in an arbitrary direction 


You saw earlier that the slope of a surface z = f(x,y) depends on the
direction of travel. The slope in the x-direction is given by the partial
derivative $\partial f/\partial x$, and the slope in the y-direction is given by $\partial f/\partial y$, but
what about the slope in an arbitrary direction?


Figure 16 A direction in the
xy-plane, starting at $(x_0, y_0)$
and heading at an angle θ to
the x-direction
We can answer this question using the chain rule, but first we must say
what happens to the x- and y-coordinates when we move away from a
given point $(x_0, y_0)$ in a given direction.


In Figure 16, the arrow represents a direction in the xy-plane, and s is the
distance travelled away from a starting point $(x_0, y_0)$ in this direction. By
simple trigonometry, we see that as we move away from $(x_0, y_0)$, the x- and
y-coordinates vary as


$$x = x_0 + s\cos\theta, \quad y = y_0 + s\sin\theta, \tag{26}$$


where θ is the (smaller) angle between the given direction and the positive
x-direction. For a given direction, θ has a fixed value, so equations (26)
define functions of the form x = x(s) and y = y(s). In other words, they
are parametric equations describing a straight-line journey away from
$(x_0, y_0)$ in terms of the parameter s.


Substituting x = x(s) and y = y(s) in the function f(x,y) gives a function
f(x(s), y(s)) of s. We can therefore use the chain rule in the form of
equation (23) to write down the rate of change of f with respect to s:

$$\frac{df}{ds} = \frac{\partial f}{\partial x}\,\frac{dx}{ds} + \frac{\partial f}{\partial y}\,\frac{dy}{ds}.$$

The derivatives $dx/ds$ and $dy/ds$ are immediately found from
equations (26), so we conclude that


$$\frac{df}{ds} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta.$$

But what does $df/ds$ mean? It is the rate of change of f, the height of the
surface, with respect to the parameter s. In this case, however, the
parameter s is the distance travelled in the chosen direction. It follows that
at any given point, $df/ds$ is the slope of the surface in the chosen direction.


We conclude that at a point (x,y), the slope of the surface z = f(x,y) in a
chosen direction is


$$\text{slope} = f_x(x,y)\cos\theta + f_y(x,y)\sin\theta, \tag{27}$$

where θ is related to the chosen direction as shown in Figure 16.


Rather than referring to Figure 16, it is convenient to specify our chosen
direction more compactly. The unit vector in the chosen direction is just


$$\hat{\mathbf{n}} = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j},$$


and this has components $n_x = \cos\theta$ and $n_y = \sin\theta$, so we can restate our
conclusion as follows.


Slope of a surface in an arbitrary direction


At any point (x, y), the slope of the surface z = f(x,y) in the
direction of the unit vector $\hat{\mathbf{n}}$ is


$$\text{slope in } \hat{\mathbf{n}}\text{-direction} = n_x f_x(x,y) + n_y f_y(x,y). \tag{28}$$




We will rephrase and apply this result in the next subsection, using the 
important concept of gradient. 


Exercise 13 
Check that equation (28) makes sense in the following special cases. 
(a) n points in the x-direction 


(b) n points in the y-direction 


2.4 The gradient vector and maximum slope 


For a given surface z = f(x,y), the slope at a given point depends on the 
direction chosen for n. In one direction, the slope is greatest, and this 
corresponds to climbing straight up the hill. In another direction, the slope 
is zero, and this corresponds to skirting the hill, following a contour line. 


To investigate the various slopes that can be obtained, it is very useful to
define a new quantity. We take the first-order partial derivatives $f_x(x,y)$
and $f_y(x,y)$ and form the vector $f_x(x,y)\,\mathbf{i} + f_y(x,y)\,\mathbf{j}$. This is called the
gradient vector, and is denoted by grad f.


Note that grad is printed in
bold because the gradient is a
vector function. In your written
work you should underline it.


The gradient vector


Given a function f(x,y), the gradient vector or gradient of f is
defined by


$$\mathbf{grad}\,f = f_x(x,y)\,\mathbf{i} + f_y(x,y)\,\mathbf{j} = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}. \tag{29}$$


This is a vector-valued function of x and y. At a given point (a,b), it
is a particular vector:


$$\mathbf{grad}\,f(a,b) = f_x(a,b)\,\mathbf{i} + f_y(a,b)\,\mathbf{j}. \tag{30}$$


Exercise 14 


Calculate the gradient of the function f(x,y) = ry’, and evaluate it at the 
point (1,2). 


According to definition (29), the components $[\mathbf{grad}\,f]_x$ and $[\mathbf{grad}\,f]_y$ of
the gradient vector are just the first-order partial derivatives $f_x$ and $f_y$,
and this allows us to rewrite equation (28) as


$$\text{slope in } \hat{\mathbf{n}}\text{-direction} = n_x\,[\mathbf{grad}\,f]_x + n_y\,[\mathbf{grad}\,f]_y = \hat{\mathbf{n}} \cdot \mathbf{grad}\,f. \tag{31}$$


In this context, $[\mathbf{grad}\,f]_x$ and
$[\mathbf{grad}\,f]_y$ are the x- and
y-components of the vector
grad f.


The right-hand side is the scalar product of two vectors, n and grad f. 
Recall from Unit 4 that the
component of a vector a in the
direction of the unit vector $\hat{\mathbf{n}}$ is
given by the scalar product $\hat{\mathbf{n}} \cdot \mathbf{a}$.


Because $\hat{\mathbf{n}}$ is a unit vector, we can say that the slope in the direction
of $\hat{\mathbf{n}}$ is equal to the component of grad f in the direction of $\hat{\mathbf{n}}$.
This is a simple and memorable result.


Example 8 


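Calculate the slope of the function $f(x, y) = xy^3$ at the point (1,2), in the
direction of the unit vector $\hat{\mathbf{n}} = \tfrac{3}{5}\mathbf{i} - \tfrac{4}{5}\mathbf{j}$.


Solution


The gradient of the function is


$$\mathbf{grad}\,f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} = y^3\,\mathbf{i} + 3xy^2\,\mathbf{j}.$$

At any point (x,y), the slope in the direction of the unit vector $\hat{\mathbf{n}}$ is


$$\hat{\mathbf{n}} \cdot \mathbf{grad}\,f = \left(\tfrac{3}{5}\mathbf{i} - \tfrac{4}{5}\mathbf{j}\right) \cdot \left(y^3\,\mathbf{i} + 3xy^2\,\mathbf{j}\right) = \tfrac{3}{5}y^3 - \tfrac{12}{5}xy^2,$$


so at the point (1,2), the slope is 24/5 - 48/5 = -24/5 = -4.8.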
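

The calculation in Example 8 can be reproduced with a few lines of Python. The sketch below forms the gradient of $f(x,y) = xy^3$ and takes its scalar product with a unit vector. The helper `unit`, which normalises a non-unit direction before it is used, is an illustrative addition and not part of the example.

```python
import math

# Minimal sketch: slope in a given direction as n . grad f (equation (31)).

def grad_f(x, y):
    # Gradient of f(x, y) = x * y**3, as (f_x, f_y)
    return (y**3, 3 * x * y**2)

def unit(v):
    # Hypothetical helper: scale a non-zero vector to unit length.
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length)

def slope(direction, point):
    n = unit(direction)
    gx, gy = grad_f(*point)
    return n[0] * gx + n[1] * gy

print(slope((3 / 5, -4 / 5), (1, 2)))  # approximately -4.8
print(slope((3, -4), (1, 2)))          # same direction, so the same slope
```

Normalising first means that any vector along the chosen direction gives the same slope, which matters when the direction is specified by a vector that is not of unit length.
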


Exercise 15 


Calculate the slope of the function $f(x,y) = 2x^2y^2 + 3xy^2$ at the point
(1,2), in the direction of the (non-unit) vector $\mathbf{i} - \mathbf{j}$.


A note on terminology 


When dealing with the graph of a function of one variable, we use the 
words slope and gradient interchangeably. For a function of two 
variables, we must be more careful. We refer to the slope in a 
particular direction. The gradient is the gradient vector in 

equation (29). The component of the gradient in a given direction is 
equal to the slope in that direction, but the gradient vector is not 
itself a slope. 


We can use the properties of scalar products to deduce some properties of
the gradient vector. Any scalar product $\mathbf{a} \cdot \mathbf{b}$ can be written as $|\mathbf{a}|\,|\mathbf{b}|\cos\alpha$,
where α is the (smaller) angle between the directions of a and b. Since $\hat{\mathbf{n}}$
is a unit vector, with $|\hat{\mathbf{n}}| = 1$, equation (31) gives


$$\text{slope in } \hat{\mathbf{n}}\text{-direction} = |\hat{\mathbf{n}}|\,|\mathbf{grad}\,f|\cos\alpha = |\mathbf{grad}\,f|\cos\alpha, \tag{32}$$


where α is the angle between the directions of $\hat{\mathbf{n}}$ and grad f. From this
equation we can gather a rich harvest.


• The slope varies with α, and the maximum slope arises when $\cos\alpha = 1$
and α = 0. This occurs when $\hat{\mathbf{n}}$ points in the same direction as
grad f. So the maximum slope is found in the direction of the vector
grad f. Put another way, the direction of grad f has a simple
interpretation: it is the direction of maximum slope.


• Setting α = 0 in equation (32), the maximum slope is equal to
$|\mathbf{grad}\,f|$. So the magnitude of grad f is equal to the maximum slope.


• Along a contour line, the function values are constant and the slope is
zero. So if $\hat{\mathbf{n}}$ points along a contour line of f, the slope must be equal
to zero. However, equation (32) shows that zero slope corresponds to
$\cos\alpha = 0$ and $\alpha = \pi/2$, which tells us that $\hat{\mathbf{n}}$ and grad f are
perpendicular. Hence grad f is perpendicular to the contour lines of f.


The properties of the gradient vector may be summarised as follows. 


Properties of the gradient vector 
For a smooth surface z = f(x,y), at any given point (x, y):

• the gradient vector grad f is a vector in the xy-plane


• grad f points in the direction of maximum slope, and its
magnitude is equal to the maximum slope


• grad f is perpendicular to the contour line at the given point.


To illustrate these properties, consider the function $f(x,y) = x^2 + y^2$,
plotted in Figure 17(a). In this case, the gradient vector is


$$\mathbf{grad}\,f = 2x\,\mathbf{i} + 2y\,\mathbf{j},$$


and this defines a vector at each point in the xy-plane. Figure 17(b) shows
arrows representing the gradient vectors at a selection of points.


Figure 17 (a) A three-dimensional graph of $f(x,y) = x^2 + y^2$. (b) A map
in the xy-plane with arrows representing grad f at a selection of points.
Contour lines of f(x,y) are shown in orange.


In this case, grad f points radially away from the origin, and its 
magnitude increases with radial distance from the origin. This makes sense 
because the slope is steepest in the radially outward direction, and the 
maximum slope grows as we move outwards. Figure 17(b) also plots the 
contour lines, which are circles centred on the origin. As expected, the 
contour line through any given point is perpendicular to the gradient 
vector at that point. 
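

The three properties of the gradient vector can be illustrated numerically for $f(x,y) = x^2 + y^2$. The Python sketch below scans over directions θ at a sample point, finds the direction of greatest slope using equation (27), and compares it with the direction and magnitude of grad f. The sample point (1, 2) and the angular resolution of 0.1 degrees are assumptions chosen for illustration.

```python
import math

# Minimal sketch: properties of grad f for f(x, y) = x**2 + y**2.

def grad_f(x, y):
    return (2 * x, 2 * y)

def slope(theta, x, y):
    # Equation (27): slope = f_x cos(theta) + f_y sin(theta)
    gx, gy = grad_f(x, y)
    return gx * math.cos(theta) + gy * math.sin(theta)

x, y = 1.0, 2.0                                        # hypothetical sample point
angles = [2 * math.pi * k / 3600 for k in range(3600)]  # 0.1 degree steps

best_angle = max(angles, key=lambda th: slope(th, x, y))
max_slope = slope(best_angle, x, y)

gx, gy = grad_f(x, y)
grad_angle = math.atan2(gy, gx)
grad_magnitude = math.hypot(gx, gy)

# The contour through (x, y) is a circle centred on the origin,
# so a tangent vector to it is (-y, x).
tangent_dot_grad = gx * (-y) + gy * x

print("steepest direction (rad):", best_angle, "vs grad angle:", grad_angle)
print("maximum slope:", max_slope, "vs |grad f|:", grad_magnitude)
print("tangent . grad f:", tangent_dot_grad)
```

To within the angular resolution of the scan, the steepest direction matches the direction of grad f and the maximum slope matches its magnitude, while the scalar product with the contour tangent is zero.
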
Exercise 16


Given a function f(x,y), how would you characterise the direction in the 
xy-plane that gives the steepest decrease in f(x,y) at a given point (a,b)? 


Exercise 17 


A bug in the xy-plane finds itself in a toxic environment. The level of 
toxicity is given by the function f(x,y) = 2x?y — 32°. The bug is at the 
point (1,2). In what direction away from (1,2) should it initially move in 
order to lower its exposure to the toxin as rapidly as possible? Specify the 
direction as a unit vector. 


2.5 The chain rule beyond two variables 


So far we have focused on functions of two variables. For functions of three
or more variables, it is really a case of ‘more of the same thing’. For
example, if f = f(x,y,z), the chain rule for small changes becomes

$$\delta f \simeq \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + \frac{\partial f}{\partial z}\,\delta z.$$

This has the same pattern as for a function of two variables, but with an
extra term involving z. The same is true for all the other forms of the
chain rule. For example, if x = x(t), y = y(t) and z = z(t), we have

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt} + \frac{\partial f}{\partial z}\,\frac{dz}{dt}.$$


The calculations get a bit longer, but are essentially the same.


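For a function f(x,y,z), the gradient vector is a vector in
three-dimensional space, given by

$$\mathbf{grad}\,f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k},$$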

where i, j and k are Cartesian unit vectors in three-dimensional space. At 
any given point, this gradient vector is perpendicular to the contour 
surfaces of f(x,y, z) at that point. The direction of grad f is the direction 
in which the function increases most rapidly, and the magnitude of grad f 
is the maximum rate of change of f with respect to the distance moved in 
three-dimensional space. 


Exercise 18 
(a) Calculate the gradient of the function $f(x,y,z) = x^2 + y^2 + 2z^2$.


(b) The surface of a solid object is given by the equation f(x,y, z) = 7, 
where f(x,y, z) is the function in part (a). Find a unit vector that is 
perpendicular to this surface at the point (1, 2,1). 


Exercise 19 

(a) Use the result of Example 3 to find the gradient of the function 
V(x,y,2) = (er +y? +2), 

(b) Describe in words the direction in which V increases most rapidly. 


(c) What is the magnitude of grad V at any point $(x, y, z) \neq (0,0,0)$?


3 Taylor polynomials 


This section is a sort of bridge. Our interest in Taylor polynomials arises 
mainly because they are needed in the final section of this unit, which 
deals with the stationary points of functions and their classification. The 
main point to grasp is the fact that close to a chosen point, functions can 
be approximated by polynomials. We start by revising the situation for 
functions of a single variable. 


3.1 Review for functions of one variable 


You know that a function of one variable can be approximated by a
suitable polynomial. For example, Figure 18 shows that the function
$f(x) = \sin x$ can be approximated near x = 0 by the simple polynomial
p(x) = x. This approximation is a good one provided that we stay close
enough to x = 0 (say within 0.1 of it). Similarly, Figure 19 shows that the
function $g(x) = \cos x$ can be approximated near $x = \pi$ by the second-order
polynomial $p(x) = -1 + \tfrac{1}{2}(x - \pi)^2$.


How do we choose a suitable polynomial to use in any particular case? If
we want to approximate the function f(x) near the point x = a, the secret
is to choose a polynomial that matches f(x) in value, and in the values of
its first few derivatives, at x = a.


For example, let us compare the function $f(x) = \sin x$ with the polynomial
p(x) = x at the point x = 0. The values match at x = 0 because
f(0) = p(0) = 0. The first derivatives are


$$f'(x) = \cos x \quad\text{and}\quad p'(x) = 1.$$


These also match at x = 0 because $f'(0) = p'(0) = 1$. Finally, the second
derivatives are


$$f''(x) = -\sin x \quad\text{and}\quad p''(x) = 0,$$

and these also match at x = 0 because $f''(0) = p''(0) = 0$.


Figure 18 Near x = 0, $\sin x$
(orange) is approximated by
p(x) = x (blue)


Figure 19 Near $x = \pi$, $\cos x$
(orange) is approximated by
$p(x) = -1 + \tfrac{1}{2}(x - \pi)^2$ (blue)


Figure 20 A graph of $e^x$
against x (in orange)
compared with Taylor
polynomials for $e^x$ of orders
1, 2 and 4 (blue)
Exercise 20


Show that the function $f(x) = \cos x$ and the polynomial
$p(x) = -1 + \tfrac{1}{2}(x - \pi)^2$ are matched in their values and in the values of
their first, second and third derivatives at $x = \pi$.


The problem of finding a suitable polynomial has a general solution. The
polynomial that matches f(x) in its values, and in the values of its first
n derivatives, at the point x = a, is


$$p_n(x) = f(a) + f'(a)(x - a) + \frac{1}{2!}f''(a)(x - a)^2 + \frac{1}{3!}f'''(a)(x - a)^3 + \cdots + \frac{1}{n!}f^{(n)}(a)(x - a)^n, \tag{33}$$


where $f^{(n)}(a)$ is the nth derivative of f(x) evaluated at x = a, and n! is
factorial n given by n! = n(n - 1)(n - 2)...1.


This polynomial is called the nth-order Taylor polynomial for f(x)
about x = a.


To take a definite case, consider the function $f(x) = \cos x$ of Exercise 20.
In this case, successive differentiations give


$$f(x) = \cos x, \quad f'(x) = -\sin x, \quad f''(x) = -\cos x, \quad f'''(x) = \sin x.$$

So at the point $x = \pi$, we have

$$f(\pi) = -1, \quad f'(\pi) = 0, \quad f''(\pi) = 1, \quad f'''(\pi) = 0.$$


Substituting these constants into equation (33) gives the third-order Taylor
polynomial for $\cos x$ about the point $x = \pi$:

$$p(x) = -1 + \tfrac{1}{2}(x - \pi)^2,$$

and this is just what we used in Figure 19.
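

A short numerical experiment shows how quickly this approximation improves near $x = \pi$. The Python sketch below tabulates the difference between $\cos x$ and the Taylor polynomial $p(x) = -1 + \tfrac{1}{2}(x - \pi)^2$ at a few displacements from π. The particular displacements are assumptions chosen for illustration.

```python
import math

# Minimal sketch: accuracy of the Taylor polynomial for cos x about x = pi.

def p(x):
    return -1 + 0.5 * (x - math.pi) ** 2

def error(dx):
    # Absolute difference between cos x and p(x) at x = pi + dx
    x = math.pi + dx
    return abs(math.cos(x) - p(x))

displacements = [0.5, 0.1, 0.01]  # hypothetical sample displacements
errors = [error(dx) for dx in displacements]

for dx, err in zip(displacements, errors):
    print(f"dx = {dx:<5} error = {err:.3e}")
```

Because the third derivative vanishes at $x = \pi$, the first neglected term is of fourth order, so reducing the displacement by a factor of 10 should reduce the error by a factor of roughly $10^4$.
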


Two points about Taylor polynomials are worth mentioning. First, for 
many functions, higher-order Taylor polynomials produce successively 
better approximations. This is illustrated in Figure 20. Second, as we 
approach the point x = a about which a given low-order Taylor polynomial 
is calculated, the polynomial becomes increasingly accurate. If we get 
extremely close to the point x = a, the function and its low-order Taylor 
polynomials become practically indistinguishable. 


Exercise 21 
(a) Find the first and second derivatives of the function f(t) = In(1 +??). 


(b) Write down the second-order Taylor polynomial for f(t) about the
point t = 0.


(c) Write down the second-order Taylor polynomial for f(t) about the
point t = 1.




3.2 Functions of two variables 


We now extend the concept of a Taylor polynomial to functions of two or 
more variables. First we must generalise the idea of a polynomial to cover 
more than one variable. An expression of the form 


$$p(x, y) = A + Bx + Cy, \tag{34}$$

where A, B and C are constants (with B and C not both equal to zero), is
called a polynomial of order 1 in x and y. It is also called a linear
polynomial in x and y.


Polynomials in more than one
variable are also called
multinomials.


An expression of the form

$$p(x,y) = A + Bx + Cy + Dx^2 + Exy + Fy^2, \tag{35}$$


where A, B, C, D, E and F are constants (with D, E and F not all equal
to zero), is called a polynomial of order 2. It is also called a quadratic
polynomial in x and y.


Notice that each term in these polynomials is of the form constant × $x^n y^m$,
where n and m are zero or positive integers. A term in a polynomial is
said to be of order N if n + m = N. So linear polynomials contain terms
up to order 1, and quadratic polynomials contain terms up to order 2.


Given a function f(x,y), we would like to find a suitable polynomial in x 
and y that approximates the function in the vicinity of a point (a,b). The 
key is to choose the coefficients in the polynomial in such a way that the 
function and the polynomial match in their values, and in the values of 
their first n partial derivatives, at the point (a,b). This is just what an 
nth-order Taylor polynomial does. In practice, we need Taylor polynomials 
of only first and second order, so we will concentrate on these. 


We will first write down a general expression for the first-order Taylor 
polynomial for a function f(x,y) about a point (a,b). Then we will check 
that it has the desired properties. This process will then be repeated for 
the second-order Taylor polynomial. 


First-order Taylor polynomial 


The first-order Taylor polynomial for f(x,y) about (a,b) is
\[
p_1(x, y) = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b). \tag{36}
\]


We can easily check that this polynomial does what is needed. Setting
x = a and y = b in equation (36), we see that $p_1(a, b) = f(a, b)$, so the
function and the polynomial have the same value at the point (a, b).
Moreover, a and b have fixed values, so $f(a, b)$, $f_x(a, b)$ and $f_y(a, b)$ are all
constants. Partial differentiation of $p_1(x, y)$ with respect to x and y then
gives
\[
\frac{\partial p_1}{\partial x} = f_x(a, b) = \left.\frac{\partial f}{\partial x}\right|_{x=a,\,y=b}
\quad\text{and}\quad
\frac{\partial p_1}{\partial y} = f_y(a, b) = \left.\frac{\partial f}{\partial y}\right|_{x=a,\,y=b}.
\]
In particular, the values of $\partial p_1/\partial x$ and $\partial p_1/\partial y$ match those of $\partial f/\partial x$ and
$\partial f/\partial y$ at (a,b), so the function and the polynomial $p_1(x, y)$ are matched in
their values, and in the values of their first-order partial derivatives, at the
point (a,b). This is what is required of a first-order Taylor polynomial.


The second-order Taylor polynomial for a function involves its 
second-order partial derivatives. 


Second-order Taylor polynomial 


The second-order Taylor polynomial for f(x,y) about (a,b) is
\[
\begin{aligned}
p_2(x, y) = {} & f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b) \\
& + \tfrac{1}{2}\bigl( f_{xx}(a, b)(x - a)^2 + 2 f_{xy}(a, b)(x - a)(y - b) + f_{yy}(a, b)(y - b)^2 \bigr).
\end{aligned}
\tag{37}
\]
(Be careful to get the second line right: common errors are to not
multiply all three terms by 1/2, or to omit the factor 2 in the
middle term.)


It is straightforward to check that the value of this polynomial, and the 
values of all its first- and second-order derivatives, match those of f(x, y) 
at (a,b). This is done in the following exercise. 


Exercise 22 
Writing $h(x, y) = p_2(x, y)$, use equation (37) to show that
\[
\begin{gathered}
h(a, b) = f(a, b), \\
h_x(a, b) = f_x(a, b), \quad h_y(a, b) = f_y(a, b), \\
h_{xx}(a, b) = f_{xx}(a, b), \quad h_{yy}(a, b) = f_{yy}(a, b), \quad h_{xy}(a, b) = f_{xy}(a, b).
\end{gathered}
\]

(There is no need to check that $h_{yx}(a, b) = f_{yx}(a, b)$ because this follows
from $h_{xy}(a, b) = f_{xy}(a, b)$, thanks to the mixed partial derivative theorem.)


Example 9 


Determine the Taylor polynomials of orders 1 and 2 about the point (2, 1) 
for the function $f(x, y) = x^3 + xy - 2y^2$.


Solution


Differentiating the function partially with respect to x and partially with
respect to y gives
\[
f_x(x, y) = 3x^2 + y, \quad f_y(x, y) = x - 4y.
\]
Differentiating partially again gives
\[
f_{xx}(x, y) = 6x, \quad f_{xy}(x, y) = 1, \quad f_{yy}(x, y) = -4.
\]
It follows that
\[
\begin{gathered}
f(2, 1) = 8, \quad f_x(2, 1) = 13, \quad f_y(2, 1) = -2, \\
f_{xx}(2, 1) = 12, \quad f_{xy}(2, 1) = 1, \quad f_{yy}(2, 1) = -4,
\end{gathered}
\]
so the first-order Taylor polynomial is
\[
p_1(x, y) = 8 + 13(x - 2) - 2(y - 1),
\]
and the second-order Taylor polynomial is
\[
\begin{aligned}
p_2(x, y) &= 8 + 13(x - 2) - 2(y - 1) \\
&\quad + \tfrac{1}{2}\bigl(12(x - 2)^2 + 2(x - 2)(y - 1) - 4(y - 1)^2\bigr) \\
&= 8 + 13(x - 2) - 2(y - 1) \\
&\quad + 6(x - 2)^2 + (x - 2)(y - 1) - 2(y - 1)^2.
\end{aligned}
\]


The answer can be left in this form as, for many purposes, there is no 
advantage to be gained in collecting terms in powers of x and y. 
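
As an informal numerical check of Example 9, the following Python sketch evaluates f and its two Taylor polynomials at a point near (2, 1). The names `f`, `p1` and `p2` and the test point (2.1, 0.9) are illustrative choices, not part of the example itself.

```python
def f(x, y):
    """The function from Example 9."""
    return x**3 + x*y - 2*y**2

def p1(x, y):
    """First-order Taylor polynomial of f about (2, 1)."""
    return 8 + 13*(x - 2) - 2*(y - 1)

def p2(x, y):
    """Second-order Taylor polynomial of f about (2, 1)."""
    return p1(x, y) + 6*(x - 2)**2 + (x - 2)*(y - 1) - 2*(y - 1)**2

# A point close to (2, 1); this particular choice is arbitrary.
x, y = 2.1, 0.9
err1 = abs(f(x, y) - p1(x, y))  # error of the first-order polynomial
err2 = abs(f(x, y) - p2(x, y))  # error of the second-order polynomial
print(err1, err2)
```

For this particular function the only term of order higher than 2 is the cubic term in x, so the second-order error is simply |x − 2|³, and it should be noticeably smaller than the first-order error.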


Exercise 23 


Given f(x,y) = 27e°, find the first- and second-order Taylor polynomials 
for f(x,y) about the point (2,0). 


The tangent plane 


You saw in Subsection 1.3 that a function f(x,y) can be plotted as a 
surface z = f(x,y) in three-dimensional space. The first-order Taylor 
polynomial for this function is 


\[
p_1(x, y) = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b),
\]
and this can also be plotted as a surface
\[
z = p_1(x, y).
\]
Comparing with equation (3) and the discussion following it, we see that
the surface $z = p_1(x, y)$ is a plane. It is called the tangent plane of the
surface z = f(x, y).


To see why this name is appropriate, first note that $f(a, b) = p_1(a, b)$, so
the surface z = f(x,y) coincides with the tangent plane at the point (a,b).
We also know that the first partial derivatives of f are the same as those of
$p_1$ at the point (a,b). So we must have
\[
[\operatorname{grad} p_1]_{x=a,\,y=b} = f_x(a, b)\,\mathbf{i} + f_y(a, b)\,\mathbf{j} = [\operatorname{grad} f]_{x=a,\,y=b}. \tag{38}
\]

The slope in any chosen direction in the xy-plane is the component of the
gradient vector in that direction, so the tangent plane and the surface of
the function have exactly the same slopes at (a,b) in all directions. This is
why the tangent plane is so called. It is a natural generalisation of the
concept of a tangent line to a curve at a given point (see Figure 21).


3.3. Matrix representation of Taylor polynomials 


Our main interest will be in first- and second-order Taylor polynomials for 
functions of two or more variables. For future use, it will be helpful to take 
another look at equations (36) and (37), and recast them in matrix form. 


Figure 21 The tangent plane of $f(x, y) = x^2 + y^2$ at the point x = −1, y = −1


First note that the first-order Taylor polynomial about (a,b) can be
written as
\[
\begin{aligned}
p_1(x, y) &= f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b) \\
&= f(a, b) + \begin{bmatrix} f_x(a, b) & f_y(a, b) \end{bmatrix} \begin{bmatrix} x - a \\ y - b \end{bmatrix}.
\end{aligned}
\tag{39}
\]


By using matrix multiplication on the right-hand side, you can check that 
this is equivalent to equation (36). 


It is convenient to introduce the column vector
\[
\mathbf{R} = \begin{bmatrix} x - a \\ y - b \end{bmatrix}, \tag{40}
\]
which represents the displacement of the point (x,y) from (a,b) and is
called the displacement vector. We also define
\[
\mathbf{G} = \begin{bmatrix} f_x(a, b) \\ f_y(a, b) \end{bmatrix}, \tag{41}
\]
which is a matrix representation of the gradient vector.


In terms of these matrices, the first-order Taylor polynomial can be 
written in the compact form 


\[
p_1(x, y) = f(a, b) + \mathbf{G}^T \mathbf{R}, \tag{42}
\]
where the transpose converts the column matrix G into the row matrix 
needed in equation (39). 


The second-order Taylor polynomial can also be written in matrix form. 
We introduce the matrix of second-order partial derivatives 


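\[
\mathbf{H} = \begin{bmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{bmatrix}.
\]
Then, using matrix multiplication, you can check that
\[
\begin{aligned}
\mathbf{R}^T \mathbf{H} \mathbf{R}
&= \begin{bmatrix} x - a & y - b \end{bmatrix}
\begin{bmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{bmatrix}
\begin{bmatrix} x - a \\ y - b \end{bmatrix} \\
&= \begin{bmatrix} x - a & y - b \end{bmatrix}
\begin{bmatrix} f_{xx}(a, b)(x - a) + f_{xy}(a, b)(y - b) \\ f_{yx}(a, b)(x - a) + f_{yy}(a, b)(y - b) \end{bmatrix} \\
&= f_{xx}(a, b)(x - a)^2 + 2 f_{xy}(a, b)(x - a)(y - b) + f_{yy}(a, b)(y - b)^2,
\end{aligned}
\tag{43}
\]
where we have used the mixed partial derivative theorem in the last step.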


Comparing with equation (37), we see that the second-order Taylor 
polynomial can be written as follows. 


Second-order Taylor polynomial in matrix form 
\[
p_2(x, y) = f(a, b) + \mathbf{G}^T \mathbf{R} + \tfrac{1}{2} \mathbf{R}^T \mathbf{H} \mathbf{R}. \tag{44}
\]


A compact formula is all well and good, but you might think that we
would need to expand it out to use it. However, in the next section you
will see that matrix methods can be applied directly to equation (44) to
obtain useful results. In particular, the matrix H will become a focus of
attention. This matrix is called the Hessian matrix after the German
mathematician Otto Hesse (1811-1874).


Equation (44) is also useful if we need a second-order Taylor polynomial 
for a function of three (or more) variables. This is because it remains true 
no matter how many variables the function contains! The only thing that 
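changes is the size of the matrices R, G and H. For a function f(x, y, z) of
three variables, expanded about the point (a, b, c),
\[
\mathbf{G} = \begin{bmatrix} f_x(a, b, c) \\ f_y(a, b, c) \\ f_z(a, b, c) \end{bmatrix}
\quad\text{and}\quad
\mathbf{R} = \begin{bmatrix} x - a \\ y - b \\ z - c \end{bmatrix},
\]
while the Hessian matrix of second-order partial derivatives becomes
\[
\mathbf{H} = \begin{bmatrix}
f_{xx}(a, b, c) & f_{xy}(a, b, c) & f_{xz}(a, b, c) \\
f_{yx}(a, b, c) & f_{yy}(a, b, c) & f_{yz}(a, b, c) \\
f_{zx}(a, b, c) & f_{zy}(a, b, c) & f_{zz}(a, b, c)
\end{bmatrix}.
\]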


You need not bother to multiply out these matrices because when we use 
equation (44) in the next section, we will need only some very general 
properties of its constituent matrices. 
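
To illustrate that equation (44) has the same shape for any number of variables, here is a minimal Python sketch that evaluates f + GᵀR + ½RᵀHR using plain lists. The function name `taylor2` and its argument names are assumptions made for this illustration; the numerical data are those of Example 9.

```python
def taylor2(f0, G, H, R):
    """Evaluate f0 + G^T R + (1/2) R^T H R.

    f0 is the function value at the expansion point, G and R are lists of
    length n, and H is an n-by-n nested list (the Hessian matrix).
    """
    n = len(R)
    linear = sum(G[i] * R[i] for i in range(n))
    quadratic = sum(R[i] * H[i][j] * R[j] for i in range(n) for j in range(n))
    return f0 + linear + 0.5 * quadratic

# Data from Example 9: f(x, y) = x^3 + x*y - 2*y^2 expanded about (2, 1).
f0 = 8
G = [13, -2]
H = [[12, 1], [1, -4]]

x, y = 2.1, 0.9        # an arbitrary nearby point
R = [x - 2, y - 1]     # displacement vector
value = taylor2(f0, G, H, R)
print(value)
```

The same function can be called with lists of length three (and a 3 × 3 Hessian) for a function of three variables; nothing in the code depends on n = 2.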


4 Minima, maxima and saddle points 


A very useful aspect of calculus is that it gives us a way of finding the 
maximum or minimum values of a function, and it lets us find the 
conditions under which these maxima or minima are attained. In this final 
section you will see how maxima, minima and other stationary points are 
found and classified for functions of two or more variables. 


Maxima and minima in the natural world 


In the natural world, many phenomena are governed by maxima or
minima. For example, a mechanical system reaches a condition of
stable equilibrium when a quantity known as the potential energy
reaches its minimum value: a chain suspended between two fixed
points will hang in such a way that its potential energy is as small as
possible (Figure 22). More complicated systems, that can exchange
heat with their surroundings, reach a state of thermal and mechanical
equilibrium when a quantity called the free energy is minimised. The
folding of protein molecules, which determines their biological
function, is dictated by the configurations of locally minimum free
energy (Figure 23). Even systems in motion, such as planets orbiting
the Sun, move in such a way that a quantity called the action is
minimised. No wonder the great Euler, seeking an ultimate
explanation, speculated:


For since the fabric of the universe is most perfect and the work
of a most wise Creator, nothing at all takes place in the universe
in which some rule of maximum or minimum does not appear.


Figure 22 A chain hangs in such a way that its potential energy has a minimum value


Figure 23 A protein molecule folds in such a way that its free energy has a local minimum value


Figure 24 A function y = f(x) with local minima and maxima


Figure 25 A function y = f(x) with a horizontal point of inflection at x = 2


4.1 Review for functions of one variable


To put our discussion in context, we will briefly remind you of the 
situation for functions of one variable. The main point is that functions 
can have local minima and local maxima. 


If we are given a function f(x) and a point x = a inside its domain:


• a local minimum occurs at x = a if there is a small region around a
within which f(x) > f(a) at all points x ≠ a

• a local maximum occurs at x = a if there is a small region around a
within which f(x) < f(a) at all points x ≠ a

• there is an extremum at x = a if this point is either a local
maximum or a local minimum.


For example, the function in Figure 24 has local minima at x = 0.3 and
x = 1.8, and a local maximum at x = 1.3. All these points are extrema.


There may be several local minima and maxima, so a particular local 
minimum need not give the smallest possible value of a function (the 
global minimum). Similarly, a particular local maximum need not give 
the largest possible value of a function (the global maximum). The 
global minimum and global maximum can be found by sifting through all 
the local minima and maxima. For example, the global minimum of the 
function in Figure 24 in the region 0 < x < 2 is at x = 0.3. We will not 
discuss this point further, but concentrate on the task of finding the local 
minima and maxima. 


We restrict attention to smooth functions f(x), and ignore any minima or 
maxima that might occur on the boundaries of the domain of f(x). Then 
calculus can help us to find the local minima and maxima. This is because 
the tangent line to the graph of f(x) against x is horizontal at a local 
minimum or maximum. Equivalently, df/dx = 0 at such a point. Any
point at which the first derivative df/dx is equal to zero is called a
stationary point. So the extrema (i.e. the local minima and local 
maxima) are stationary points. However, we can also have stationary 
points that are not extrema. 


In Figure 25, the stationary point at x = 2 is neither a local maximum nor 
a local minimum. In the immediate vicinity of x = 2, points with x < 2 
have smaller values than f(2), while points with x > 2 have larger values 
than f(2). Such a point is called a horizontal point of inflection. 


We often need to classify the stationary points — that is, decide whether 
they are local minima, local maxima or points of inflection. To classify a 
stationary point at x = a, it is helpful to use the second-order Taylor
polynomial for f(x) about a:
\[
p_2(x) = f(a) + f'(a)(x - a) + \tfrac{1}{2} f''(a)(x - a)^2.
\]
Since a is a stationary point, we know (by definition) that f'(a) = 0, so
\[
f(x) \approx p_2(x) = f(a) + \tfrac{1}{2} f''(a)(x - a)^2 \quad \text{(for $x$ close to $a$).} \tag{45}
\]


Now, if we consider a small enough region around x = a, then the
second-order Taylor polynomial $p_2(x)$ will approximate f(x) extremely
well, so well that we can replace the above ≈ sign by an equals sign with
negligible error. Then looking at equation (45), we see that the condition
f''(a) > 0 guarantees that f(x) > f(a) for all points x ≠ a in this region. This is
precisely the definition of a local minimum. A similar argument, but with
f''(a) < 0, leads to a local maximum. We therefore have the following test.


Second derivative test for stationary points 
Given a function f(x), we can say that:


• when f'(a) = 0 and f''(a) > 0, we have a local minimum at
x = a

• when f'(a) = 0 and f''(a) < 0, we have a local maximum at
x = a.


The test works unless f''(a) = 0, in which case the test gives us no
information; we would need to look at higher-order Taylor polynomials to
make a decision. Also note that the second derivative test never identifies
a horizontal point of inflection. This is not a great problem because in
practical cases the main interest lies in minima and maxima.
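
The test can be expressed as a short decision function. The following Python sketch is only an illustration: the name `classify_1d` is hypothetical, and the sample function f(x) = x³ − 3x (with stationary points at x = ±1) is an assumption chosen for this example rather than taken from the text.

```python
def classify_1d(d1, d2):
    """Classify the point x = a from d1 = f'(a) and d2 = f''(a)."""
    if d1 != 0:
        return "not a stationary point"
    if d2 > 0:
        return "local minimum"
    if d2 < 0:
        return "local maximum"
    # f''(a) = 0: the second derivative test gives no information.
    return "inconclusive"

# Illustrative function (an assumption): f(x) = x^3 - 3x.
def d1(x):
    return 3 * x**2 - 3   # f'(x)

def d2(x):
    return 6 * x          # f''(x)

print(classify_1d(d1(1), d2(1)))    # local minimum
print(classify_1d(d1(-1), d2(-1)))  # local maximum
```

Note that the "inconclusive" branch covers horizontal points of inflection as well as extrema that can be detected only from higher-order terms.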


4.2 Stationary points for functions of two variables 


We will now try to extend the above ideas to a smooth function f(x, y) of
two variables. The definitions of local minima and local maxima are 
essentially the same as before, but take account of the fact that points and 
regions are now specified by two coordinates. 


Local minima and maxima 


Given a function f(x,y) and a point (a,b) inside the domain of f:


• a local minimum occurs at (a,b) if there is a small region
around (a,b) within which f(x,y) > f(a,b) at all points
(x,y) ≠ (a,b)

• a local maximum occurs at (a,b) if there is a small region
around (a,b) within which f(x,y) < f(a,b) at all points
(x,y) ≠ (a,b).

(Here, 'inside' implies not on any boundary line.)


As before, an extremum is a point that is either a local minimum or a
local maximum. For a smooth function f(x,y), the tangent plane to the
surface z = f(x,y) is horizontal at any extremum. This means that the
slope of the tangent plane is equal to zero for any direction in the xy-plane,
so both the partial derivatives $f_x$ and $f_y$ must be zero at an extremum.


Stationary points


A point (a,b) is a stationary point of a function f(x, y) if both
$f_x(a, b)$ and $f_y(a, b)$ are equal to zero.


All extrema (i.e. all local minima and maxima) are stationary points, but 
some stationary points are not extrema. Figure 26 shows the three cases 
that can occur. In all three cases, there is a stationary point at (0,0)
because both the partial derivatives $f_x$ and $f_y$ are equal to zero there.


In Figure 26(a), we have a local minimum because all paths moving 
smoothly away from (0,0) initially climb upwards to higher function 
values. In Figure 26(b), we have a local maximum because all paths 
moving smoothly away from (0,0) initially descend downwards to lower 
function values. But in Figure 26(c), we have something different: some 
paths moving away from (0,0) climb upwards, and others descend 
downwards. This stationary point is neither a local minimum nor a local 
maximum. It is called a saddle point because the shape of the surface is 
rather like the shape of a saddle placed on a horse’s back. 


Saddle points 


A saddle point is a stationary point that is neither a local minimum
nor a local maximum. Through such a point, some paths climb to
higher function values, while others descend to lower function values.


Figure 26 Functions with stationary points at (0,0): (a) a local
minimum; (b) a local maximum; (c) a saddle point


In the next subsection you will see how stationary points can be classified 
using second-order partial derivatives. For the moment, we concentrate on 
locating the stationary points using first-order partial derivatives. 


Example 10 


Locate the stationary point(s) of the function 


\[
f(x, y) = x^2 + y^2 + (x - 1)(y + 2).
\]


Solution


Partially differentiating with respect to x gives
\[
f_x(x, y) = 2x + y + 2,
\]
and partially differentiating with respect to y gives
\[
f_y(x, y) = 2y + x - 1.
\]

At a stationary point, $f_x = f_y = 0$, so we need to solve the pair of
simultaneous equations
\[
\begin{aligned}
2x + y &= -2, \\
x + 2y &= 1.
\end{aligned}
\]

These equations have the solution x = −5/3, y = 4/3, so the only
stationary point is at (x,y) = (−5/3, 4/3).


A note of caution! We can always carry out the partial differentiations
needed to obtain the simultaneous equations $f_x = 0$, $f_y = 0$. In this unit,
such equations can always be solved by hand. More generally, however, the
equations may be too complicated for this; they would then be solved
numerically on a computer.
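
When the equations $f_x = 0$, $f_y = 0$ happen to be linear, as in Example 10, a computer can solve them exactly. The sketch below uses Cramer's rule with Python's `fractions` module; the helper name `solve_2x2` is hypothetical. It does not cover the nonlinear case mentioned above, which would need an iterative numerical method.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("the equations have no unique solution")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y

# Example 10: 2x + y = -2 and x + 2y = 1.
x, y = solve_2x2(2, 1, 1, 2, -2, 1)
print(x, y)  # -5/3 4/3
```

Exact rational arithmetic is used here so that the result can be compared directly with the hand calculation.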


Exercise 24 


Locate the stationary point(s) of the function 


\[
f(x, y) = 3x^2 - 4xy + 2y^2 + 4x - 8y.
\]


Exercise 25 


Locate the stationary point(s) of the function 


\[
f(x, y) = xy(x + y - 3).
\]


4.3. The eigenvalue test 


There remains the task of classifying the stationary points — deciding 
which are local maxima, which are local minima, and which are saddle 
points. In broad terms, our tactics will be the same as for functions of a 
single variable: we will use the second-order Taylor polynomial. This 
method works for functions of any number of variables, but we will 
initially consider a function f(x,y) of two variables. 


Suppose that f(x,y) has a stationary point at (a,b), and that $p_2(x, y)$ is
the second-order Taylor polynomial for f(x,y) about (a,b). The full
expression for this polynomial is given in equation (37), but there are great
advantages in using the matrix version given in equation (44). This is
\[
p_2(x, y) = f(a, b) + \mathbf{G}^T \mathbf{R} + \tfrac{1}{2} \mathbf{R}^T \mathbf{H} \mathbf{R}, \tag{46}
\]
where
\[
\mathbf{G} = \begin{bmatrix} f_x(a, b) \\ f_y(a, b) \end{bmatrix}
\quad\text{and}\quad
\mathbf{R} = \begin{bmatrix} x - a \\ y - b \end{bmatrix}
\]
are the gradient at (a,b) and the displacement away from (a,b), written as
column vectors, and H is the Hessian matrix of second derivatives at (a,b):
\[
\mathbf{H} = \begin{bmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{bmatrix}.
\]

Because (a,b) is a stationary point, both $f_x(a, b)$ and $f_y(a, b)$ are equal to
zero. The gradient matrix G therefore has both elements equal to zero,
and equation (46) reduces to
\[
p_2(x, y) = f(a, b) + \tfrac{1}{2} \mathbf{R}^T \mathbf{H} \mathbf{R}.
\]

In the immediate vicinity of (a,b), any small difference between f(x,y) and
$p_2(x, y)$ becomes negligible, and we can safely replace $p_2(x, y)$ on the
left-hand side by f(x,y). Rearranging slightly, we can write
\[
f(x, y) - f(a, b) = \tfrac{1}{2} \mathbf{R}^T \mathbf{H} \mathbf{R}, \tag{47}
\]
provided that (x,y) is close enough to the stationary point at (a,b).


Multiplying out the matrix product on the right-hand side of
equation (47), it is easy to see that
\[
\mathbf{R}^T \mathbf{H} \mathbf{R} = f_{xx}(a, b)(x - a)^2 + f_{yy}(a, b)(y - b)^2 + 2 f_{xy}(a, b)(x - a)(y - b). \tag{48}
\]
(This matrix multiplication was carried out in equation (43).)
Denoting the right-hand side of this equation by Q(x,y), we can write
\[
\mathbf{R}^T \mathbf{H} \mathbf{R} = Q(x, y), \tag{49}
\]
where Q(x, y) is a quadratic function of x and y with coefficients that
depend on the second-order partial derivatives of f(x,y) at the stationary
point (a,b).


Now, recall how local minima, local maxima and saddle points are defined. 


For a local minimum, the function values all around the stationary point
are greater than at the stationary point itself. This means that
f(x,y) > f(a,b) for all (x,y) that are sufficiently close, but not equal, to
(a,b). Using equations (47) and (49), we see that this condition is
guaranteed if Q(x, y) > 0 for all (x,y) ≠ (a,b).


For a local maximum, the function values all around the stationary point
are smaller than at the stationary point itself. This is guaranteed if
Q(x, y) < 0 for all (x,y) ≠ (a,b).


Saddle points correspond to increases in some directions and decreases in
others, so the sign of Q(x, y) must depend on x and y in this case. We
therefore have the following criterion.


The nature of a stationary point (a,b) depends on $Q(x, y) = \mathbf{R}^T \mathbf{H} \mathbf{R}$.


• If Q(x,y) > 0 for all (x,y) ≠ (a,b), then we have a local
minimum.


• If Q(x,y) < 0 for all (x,y) ≠ (a,b), then we have a local
maximum.


• If Q(x,y) is positive for some (x,y) and negative for other (x,y),
then we have a saddle point.


Given any function f(x, y) with a stationary point at (a,b), we can try to
classify this stationary point by examining the sign of Q(x, y) for all x and
y around (a,b). Let us take a very simple example. Suppose that
\[
f(x, y) = (x - 1)^2 + (y - 1)^2.
\]
Then
\[
f_x = 2(x - 1) \quad\text{and}\quad f_y = 2(y - 1),
\]
so there is a stationary point at (1,1). The second-order partial derivatives
are
\[
f_{xx} = 2, \quad f_{yy} = 2, \quad f_{xy} = f_{yx} = 0,
\]
which are constants in this case. Hence equation (48) gives
\[
Q(x, y) = 2(x - 1)^2 + 2(y - 1)^2.
\]
This is the sum of two squared terms, neither of which can be negative and
which vanish together only at (1,1), so Q(x,y) > 0 for all (x,y) ≠ (1,1).
We therefore conclude that the
stationary point (1,1) is a local minimum. This is hardly surprising, but
illustrates the logic of our method.


Using eigenvalues 


In a more general case, the sign of Q may not be so obvious. Fortunately, 
there is a systematic way to proceed, using the eigenvalues of the Hessian 
matrix H. The method hinges on the fact that H is a real symmetric 
matrix. It is real because f, x and y are all real-valued, so $f_{xx}$, $f_{yy}$, $f_{xy}$
and $f_{yx}$ are all real. And it is symmetric because the mixed partial
derivative theorem ensures that $f_{xy} = f_{yx}$.


You know from Unit 5 that real symmetric matrices have some special 
properties. Their eigenvalues are always real, and their eigenvectors can 
always be chosen to be real, of unit magnitude and mutually orthogonal. 
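For the 2 × 2 Hessian matrix considered here, there are two real eigenvalues
$\lambda_1$ and $\lambda_2$, corresponding to two real orthogonal eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$, so
\[
\mathbf{H}\mathbf{v}_1 = \lambda_1 \mathbf{v}_1 \quad\text{and}\quad \mathbf{H}\mathbf{v}_2 = \lambda_2 \mathbf{v}_2. \tag{50}
\]

The displacement vector R can be written as a linear combination of the
two real orthogonal eigenvectors:
\[
\mathbf{R} = \alpha \mathbf{v}_1 + \beta \mathbf{v}_2,
\]
where the components $\alpha$ and $\beta$ are real (because R, $\mathbf{v}_1$ and $\mathbf{v}_2$ are real).
(R, $\alpha$ and $\beta$ are functions of x and y.)

Applying H to R, and using the eigenvalue equations (50), we get
\[
\mathbf{H}\mathbf{R} = \mathbf{H}(\alpha \mathbf{v}_1 + \beta \mathbf{v}_2) = \alpha(\mathbf{H}\mathbf{v}_1) + \beta(\mathbf{H}\mathbf{v}_2) = \alpha\lambda_1 \mathbf{v}_1 + \beta\lambda_2 \mathbf{v}_2.
\]
We also have
\[
\mathbf{R}^T = \alpha \mathbf{v}_1^T + \beta \mathbf{v}_2^T,
\]
so
\[
Q(x, y) = \mathbf{R}^T \mathbf{H} \mathbf{R} = (\alpha \mathbf{v}_1^T + \beta \mathbf{v}_2^T)(\alpha\lambda_1 \mathbf{v}_1 + \beta\lambda_2 \mathbf{v}_2). \tag{51}
\]
Finally, we use the fact that $\mathbf{v}_1$ and $\mathbf{v}_2$ are of unit magnitude and mutually
orthogonal. In matrix terms, this means that
\[
\mathbf{v}_1^T \mathbf{v}_1 = 1, \quad \mathbf{v}_2^T \mathbf{v}_2 = 1, \quad \mathbf{v}_1^T \mathbf{v}_2 = 0, \quad \mathbf{v}_2^T \mathbf{v}_1 = 0.
\]
So multiplying out the brackets in equation (51) gives
\[
Q(x, y) = \alpha^2 \lambda_1 + \beta^2 \lambda_2, \tag{52}
\]
where $\alpha$ and $\beta$ vary over a range of real values as x and y vary. Away from
the stationary point, the displacement vector R is non-zero, which means
that the components $\alpha$ and $\beta$ cannot simultaneously be equal to zero. You
have already seen that the nature of the stationary point is determined by
the sign of Q(x, y). Now we see that this is determined by the signs of the
eigenvalues $\lambda_1$ and $\lambda_2$ of H.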


• When both eigenvalues are positive, Q(x,y) > 0, and the stationary
point is a local minimum.


• When both eigenvalues are negative, Q(x,y) < 0, and the stationary
point is a local maximum.


• When the eigenvalues have opposite signs, Q(x,y) may be positive or
negative, and the stationary point is a saddle point.
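
For reference, the step from equation (51) to equation (52) can be written out term by term. This worked expansion is added for clarity and uses only the orthonormality relations for the eigenvectors stated above.

```latex
\begin{align*}
Q(x, y)
&= (\alpha \mathbf{v}_1^T + \beta \mathbf{v}_2^T)(\alpha\lambda_1 \mathbf{v}_1 + \beta\lambda_2 \mathbf{v}_2) \\
&= \alpha^2\lambda_1\, \mathbf{v}_1^T\mathbf{v}_1
 + \alpha\beta\lambda_2\, \mathbf{v}_1^T\mathbf{v}_2
 + \alpha\beta\lambda_1\, \mathbf{v}_2^T\mathbf{v}_1
 + \beta^2\lambda_2\, \mathbf{v}_2^T\mathbf{v}_2 \\
&= \alpha^2\lambda_1 (1) + \alpha\beta\lambda_2 (0) + \alpha\beta\lambda_1 (0) + \beta^2\lambda_2 (1) \\
&= \alpha^2\lambda_1 + \beta^2\lambda_2 .
\end{align*}
```

Since $\alpha^2$ and $\beta^2$ are non-negative and not both zero away from the stationary point, the sign of Q is controlled entirely by the signs of the two eigenvalues.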


This leads to the following procedure. 


Procedure 1 The eigenvalue test 


Suppose that we are given a smooth function f(x,y) with a 
stationary point at (a,b). To establish the nature of the stationary 
point, do the following. 


1. Find the second-order partial derivatives, and evaluate them at 
the stationary point. 


2. Construct the Hessian matrix H at the stationary point, and 
determine its eigenvalues. 


3. Apply the following rules: 
• If all the eigenvalues are positive, we have a local minimum.
• If all the eigenvalues are negative, we have a local maximum.
• If the eigenvalues have mixed signs, we have a saddle point.




This procedure does not cover the case where the eigenvalues do not have 
mixed signs but include a zero; in this case, the test is inconclusive. 
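
Procedure 1 can be sketched in Python for the two-variable case. The function names below are hypothetical, and the closed-form eigenvalue formula for a symmetric 2 × 2 matrix (obtained by solving its quadratic characteristic equation) is standard algebra rather than something stated in this unit. Exact comparisons with zero are used for simplicity; with rounded floating-point data a small tolerance would be safer.

```python
import math

def eigenvalues_2x2_symmetric(fxx, fxy, fyy):
    """Eigenvalues of the symmetric matrix [[fxx, fxy], [fxy, fyy]]."""
    mean = (fxx + fyy) / 2
    radius = math.sqrt(((fxx - fyy) / 2) ** 2 + fxy ** 2)
    return mean - radius, mean + radius

def eigenvalue_test(eigenvalues):
    """Apply the rules of Procedure 1 to a collection of eigenvalues."""
    if all(lam > 0 for lam in eigenvalues):
        return "local minimum"
    if all(lam < 0 for lam in eigenvalues):
        return "local maximum"
    if any(lam > 0 for lam in eigenvalues) and any(lam < 0 for lam in eigenvalues):
        return "saddle point"
    # No mixed signs, but at least one zero eigenvalue.
    return "inconclusive"

# Hessian of f(x, y) = (x - 1)^2 + (y - 1)^2 at (1, 1), from the text.
print(eigenvalue_test(eigenvalues_2x2_symmetric(2, 0, 2)))  # local minimum
# An illustrative Hessian with eigenvalues -1 and 5 (an assumption).
print(eigenvalue_test(eigenvalues_2x2_symmetric(2, 3, 2)))  # saddle point
```

Because `eigenvalue_test` accepts any collection of eigenvalues, the same function applies unchanged to functions of more than two variables once the eigenvalues are known.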


Example 11 


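Locate the stationary point(s) of the function $f(x, y) = e^{-(x^2 + y^2)}$, and use
the eigenvalue test to classify them.


Solution

Partial differentiation gives
\[
f_x = -2x\, e^{-(x^2 + y^2)} \quad\text{and}\quad f_y = -2y\, e^{-(x^2 + y^2)}.
\]

Since $f_x = 0$ only when x = 0, and $f_y = 0$ only when y = 0, there is a
single stationary point, at (0,0).


The second-order partial derivatives are
\[
\begin{aligned}
f_{xx} &= -2 e^{-(x^2 + y^2)} + 4x^2 e^{-(x^2 + y^2)}, \\
f_{yy} &= -2 e^{-(x^2 + y^2)} + 4y^2 e^{-(x^2 + y^2)}, \\
f_{xy} &= 4xy\, e^{-(x^2 + y^2)}.
\end{aligned}
\]

At the stationary point (0,0), we have $f_{xx}(0, 0) = -2$, $f_{yy}(0, 0) = -2$ and
$f_{xy}(0, 0) = 0$. So the Hessian matrix at (0,0) is
\[
\mathbf{H} = \begin{bmatrix} -2 & 0 \\ 0 & -2 \end{bmatrix}.
\]

The eigenvalues satisfy
\[
0 = \det(\mathbf{H} - \lambda \mathbf{I}) = \begin{vmatrix} -2 - \lambda & 0 \\ 0 & -2 - \lambda \end{vmatrix} = (2 + \lambda)^2,
\]
so they are $\lambda_1 = -2$ and $\lambda_2 = -2$. These are both negative, so the
stationary point (0,0) is a local maximum.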


Exercise 26 
Locate and classify the stationary point of the function 


\[
f(x, y) = 2x^2 - xy - 3y^2 - 3x + 7y.
\]


(In Example 11, note that for a diagonal matrix, the eigenvalues are equal
to the diagonal matrix elements.)

A minor shortcut is available for functions of two variables. You may recall
from Unit 5 that the eigenvalues $\lambda_1$ and $\lambda_2$ of a 2 × 2 matrix A have the
following properties:


• Their product is equal to the determinant of A.
• Their sum is equal to the trace of A.

In the case of the Hessian matrix, we have
\[
\lambda_1 \lambda_2 = \det \mathbf{H}, \qquad \lambda_1 + \lambda_2 = \operatorname{tr} \mathbf{H}.
\]

From the first of these equations we see that the condition det H < 0
corresponds to eigenvalues of opposite signs, and therefore to a saddle
point. The condition det H > 0 corresponds to eigenvalues of the same
sign, and therefore to an extremum: a local minimum if both eigenvalues
are positive, and a local maximum if they are both negative.


In order to distinguish between these two types of extremum, we can
consider tr H, which is positive for a local minimum and negative for a
local maximum. We can find the sign of tr H from its definition
\[
\operatorname{tr} \mathbf{H} = f_{xx}(a, b) + f_{yy}(a, b).
\]
However, there is an even simpler test. Because
\[
\det \mathbf{H} = f_{xx}(a, b)\, f_{yy}(a, b) - f_{xy}^2(a, b),
\]
we see that $f_{xx}(a, b)$ and $f_{yy}(a, b)$ must have the same sign if det H > 0. It
follows that a local minimum is characterised by det H > 0 and
$f_{xx}(a, b) > 0$, and a local maximum is characterised by det H > 0 and
$f_{xx}(a, b) < 0$. (We could equally well use $f_{yy}(a, b)$ instead of $f_{xx}(a, b)$ in
this test.) The following box summarises the situation.


Determinant test for functions of two variables 


Suppose that f(x,y) is a smooth function, with a stationary point at 
(a,b) and a corresponding Hessian matrix H. Then the stationary 
point is: 

• a local minimum if det H > 0 and $f_{xx}(a, b) > 0$

• a local maximum if det H > 0 and $f_{xx}(a, b) < 0$

• a saddle point if det H < 0.


The test is inconclusive if det H = 0. 
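
The determinant test translates directly into a few lines of Python. This is a minimal sketch; the function name `determinant_test` is hypothetical, and its arguments are the values of the second-order partial derivatives at the stationary point.

```python
def determinant_test(fxx, fxy, fyy):
    """Classify a stationary point of f(x, y) from its Hessian entries."""
    det = fxx * fyy - fxy ** 2
    if det < 0:
        return "saddle point"
    if det > 0:
        # fxx and fyy share the same sign here, so testing fxx is enough.
        return "local minimum" if fxx > 0 else "local maximum"
    return "inconclusive"

# Example 11: fxx = -2, fxy = 0, fyy = -2 at (0, 0).
print(determinant_test(-2, 0, -2))  # local maximum
```

The result for Example 11 agrees with the eigenvalue test, as expected.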


Exercise 27 


Check that the determinant test reproduces the results derived in 
Example 11 and Exercise 26. 


The determinant test saves a small amount of time because it avoids the 
need to find the eigenvalues of the Hessian matrix, but it is really an aside 
to our main discussion. This is because it is restricted to functions of two 
variables. By contrast, the eigenvalue test can be extended to functions 
with any number of variables. We will briefly describe how this extension 
works. 


The definitions of stationary point, extremum, local minimum, local 
maximum and saddle point can all be extended in a natural way. For 
example, any stationary point is identified by the fact that all of its 
first-order partial derivatives are equal to zero. So for a function f(x, y, z)
of three variables, (a,b,c) is a stationary point if and only if
\[
f_x(a, b, c) = f_y(a, b, c) = f_z(a, b, c) = 0.
\]


This point is a local minimum if f(x,y,z) is greater than f(a,b,c) at all
points (x,y,z) ≠ (a,b,c) in the immediate vicinity of (a,b,c), and so on.


The arguments leading to the eigenvalue test are also similar to those 
given earlier. The only difference is that the matrices G, H and R all grow 
in dimension as the number of variables in the function increases. For a 
function of n variables, the Hessian matrix is an n x n real symmetric 
matrix. This has n real eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, corresponding to n real
orthogonal eigenvectors. So equation (52) is replaced by
\[
Q = \alpha_1^2 \lambda_1 + \alpha_2^2 \lambda_2 + \cdots + \alpha_n^2 \lambda_n,
\]
where $\alpha_1, \alpha_2, \ldots, \alpha_n$ are all real. Just as before, this leads directly to
Procedure 1, which is already phrased in such a way that it applies to any
number of eigenvalues. Here is an example for a function of three variables.


Example 12 
The function 
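\[
f(x, y, z) = x^2 + y^2 + z^2 + 3xy - 2yz
\]
has a stationary point at (0,0,0). Use the eigenvalue test to classify it.

Solution

The first-order partial derivatives of f(x,y,z) are
\[
f_x = 2x + 3y, \quad f_y = 2y + 3x - 2z, \quad f_z = 2z - 2y,
\]
and the second-order partial derivatives are
\[
\begin{gathered}
f_{xx} = 2, \quad f_{yy} = 2, \quad f_{zz} = 2, \\
f_{xy} = f_{yx} = 3, \quad f_{xz} = f_{zx} = 0, \quad f_{yz} = f_{zy} = -2.
\end{gathered}
\]

These are constants, so the Hessian matrix at (0,0,0) is
\[
\mathbf{H} = \begin{bmatrix} 2 & 3 & 0 \\ 3 & 2 & -2 \\ 0 & -2 & 2 \end{bmatrix}.
\]

The eigenvalues of H satisfy the characteristic equation
\[
\begin{aligned}
0 &= \begin{vmatrix} 2 - \lambda & 3 & 0 \\ 3 & 2 - \lambda & -2 \\ 0 & -2 & 2 - \lambda \end{vmatrix} \\
&= (2 - \lambda)\bigl((2 - \lambda)^2 - 4\bigr) - 3(3)(2 - \lambda) \\
&= (2 - \lambda)(\lambda^2 - 4\lambda - 9).
\end{aligned}
\]
So one of the eigenvalues is 2, and the other two eigenvalues are given by
the solutions of $\lambda^2 - 4\lambda - 9 = 0$, i.e.
\[
\lambda = \frac{4 \pm \sqrt{16 + 36}}{2} = 2 \pm \sqrt{13}.
\]

So the eigenvalues are 2, $2 + \sqrt{13}$ and $2 - \sqrt{13}$. There are both positive
and negative eigenvalues, so the stationary point is a saddle point.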
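
The eigenvalues found in Example 12 can be checked numerically by substituting them back into det(H − λI). The sketch below uses only the Python standard library; the helper names `det3` and `char_poly` are assumptions made for this illustration.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Hessian matrix from Example 12.
H = [[2, 3, 0],
     [3, 2, -2],
     [0, -2, 2]]

def char_poly(lam):
    """Evaluate det(H - lam * I) for the Hessian of Example 12."""
    shifted = [[H[r][c] - (lam if r == c else 0) for c in range(3)]
               for r in range(3)]
    return det3(shifted)

roots = [2, 2 + math.sqrt(13), 2 - math.sqrt(13)]
residuals = [char_poly(lam) for lam in roots]
print(residuals)  # each entry should be zero up to rounding error
```

Since the smallest root is negative and the largest is positive, the check is consistent with the conclusion that (0,0,0) is a saddle point.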


Exercise 28
Find the stationary point of the function
\[
f(x, y, z) = 3(x^2 + y^2) + 2xy + z^2,
\]


and use the eigenvalue test to classify it. 


Exercise 29 


The following functions all have stationary points at (0,0,0). Where 
possible, use the eigenvalue test to classify these stationary points. 


(a) $f(x, y, z) = x^2 + 2y^2 + 3z^2$
(b) $f(x, y, z) = x^2 + 2y^2 - 3z^2$
(c) $f(x, y, z) = x^2 + 2y^2 + 3z^4$
(d) $f(x, y, z) = x^2 + 2y^2 - 3z^4$


4.4 Constrained extrema 


This final subsection is optional. It is included for interest and
because it may be useful if you read other texts. It will not be
assessed or examined.


So far, we have considered extrema in situations where the independent 
variables can vary freely. In real life, constraints must often be taken into 
account. For example, we might ask ‘what is the maximum volume of a 
rectangular box?’. This question is rather pointless as posed, because we 
can obviously make the volume as large as we choose by building a large 
enough box! By contrast, the question ‘what is the maximum volume of a 
rectangular box with a total surface area of 4 square metres?’ is much 
more interesting. In this example, the quantity that we are maximising 
(the volume of the box) is subject to a constraint (the fixed surface area of 
the box). We are to maximise the volume while keeping the surface area 
fixed. This is a new type of problem, which we now tackle. 


Constrained extrema in the real world 


The problem of finding extrema subject to constraints is important in 
economics, business administration and many other aspects of life. 
Very often, we need to maximise or minimise something — a company 
would like to maximise its profits, or a hospital would like to minimise 
its fatalities. But this must be achieved with known, fixed assets. 


Economists often represent the desire for a given commodity by a 
function called the utility function, and they assume that consumers 
maximise this function, subject to their budget constraints. The 
question that we face is: given a fixed set of assets, how can we 
distribute them to achieve a certain goal as fully as possible? 


A whole branch of physics (statistical mechanics) is based on similar 
ideas. In this case, the fixed asset is energy and we work out how to 
distribute the total energy among the various particles in a system in 
such a way as to maximise the probability of a particular distribution. 
It turns out that some distributions are overwhelmingly more likely 
than others, giving us the power to predict, with practical certainty, 
how complicated systems containing billions of billions of particles 
will behave. This works for gases containing $10^{24}$ molecules and
galaxies containing $10^{11}$ stars, where it would be hopeless to try to
predict the detailed motion of every particle. 


Suppose that we want to find the maxima or minima of a function f(x, y)
subject to a constraint specified by the equation g(x,y) = c, where c is a
constant. If we could solve the equation g(x,y) = c to obtain y as a
function of x, we could substitute this into the function f to obtain
f(x, y(x)). This depends only on the single variable x, so we could use the
rules of ordinary calculus to find its maxima and minima. But what if we
cannot solve the equation g(x,y) = c for y?
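
To make the substitution approach concrete, here is a short worked illustration. The function and constraint are chosen here purely for illustration and do not come from the text: we take $f(x,y) = x^2 + y^2$ with the constraint $x + y = 2$, which can be solved for y.

```latex
\begin{align*}
g(x,y) = x + y = 2 \quad &\Longrightarrow \quad y(x) = 2 - x, \\
f(x, y(x)) &= x^2 + (2 - x)^2 = 2x^2 - 4x + 4, \\
\frac{d}{dx} f(x, y(x)) &= 4x - 4 = 0 \quad \Longrightarrow \quad x = 1, \; y = 1, \; f = 2.
\end{align*}
```

Since the second derivative of $2x^2 - 4x + 4$ is positive, this constrained stationary point is a minimum. The approach works here only because the constraint is easy to solve for y.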


The key to solving this problem is provided by Figure 27, which shows 
contour lines for the function f(x,y) (in orange) and the curve for the 
function g(x, y) = c (in blue). In searching for a stationary point of f(x,y),
we are obliged to travel along the blue curve in order to satisfy the 
constraint. At a point like A, the blue curve crosses contour lines of 
f(x,y), which indicates that f is changing, so A is not a stationary point. 
At point B, however, the blue curve is tangential to a contour line of 
f(x,y), and this means that f is not changing as we travel along the blue 
curve in the vicinity of B. We can therefore say that B is a stationary 
point of f(x,y) subject to the constraint g(x,y) = c. 


The distinguishing feature of point B is that there is a contour line of 
f(x,y) that is parallel to the curve g(x,y) = c at point B. The 
corresponding gradient vectors grad f and grad g are perpendicular to 
these curves, so they are also aligned (parallel or antiparallel). We can 
therefore write 


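\[ \operatorname{grad} f = \lambda \operatorname{grad} g, \tag{53} \]

where $\lambda$ is a non-zero constant whose value is at present unknown. Writing
down the components of equation (53), together with the original
constraint, gives three equations:

\[ \frac{\partial f}{\partial x} - \lambda \frac{\partial g}{\partial x} = 0, \quad
   \frac{\partial f}{\partial y} - \lambda \frac{\partial g}{\partial y} = 0, \quad
   g(x,y) = c. \tag{54} \]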


Problems where the constraint is 
specified by an inequality are 
also of interest, but will not be 
discussed here. 


Figure 27 A situation in which we are constrained to move along the blue curve over terrain whose contour lines are shown in orange


We cannot say that $\lambda = 1$,
because grad f and grad g may
have different magnitudes.

We require the solutions to be
real, so imaginary solutions are
rejected.

These are 3 equations for 3 unknowns (x, y and $\lambda$), so the problem is now
solved, at least in principle, by eliminating the unknown constant and
finding values for x and y. If we let


\[ L(x,y) = f(x,y) - \lambda g(x,y), \]

we see that equations (54) can also be written as

\[ \frac{\partial L}{\partial x} = 0, \quad \frac{\partial L}{\partial y} = 0, \quad g(x,y) = c, \tag{55} \]

which leads to the following procedure.


Procedure 2. Finding stationary points with a constraint 


To find the stationary points of the function f(x,y) subject to the 
constraint g(x,y) = c, where c is a constant, do the following. 


1. Construct the function $L(x,y) = f(x,y) - \lambda g(x,y)$, where $\lambda$ is an
unknown constant.


2. Partially differentiate L(x, y) with respect to x and y, and form
the equations $L_x = 0$ and $L_y = 0$.


3. Solve these equations, together with the constraint g(x, y) = c, to
find the stationary points of f(x,y) subject to the given constraint.


The following example shows how this procedure is used. 


Example 13 


Find the stationary points of $f(x,y) = x^2 + y^2$ subject to the constraint
xy = 4. What are the values of f at these stationary points?


Solution 


We form the function $L(x,y) = x^2 + y^2 - \lambda xy$ and calculate its first-order
partial derivatives $L_x = 2x - \lambda y$ and $L_y = 2y - \lambda x$. Combining the
stationary point conditions $L_x = 0$ and $L_y = 0$ with the constraint
equation gives

\[ 2x - \lambda y = 0, \quad 2y - \lambda x = 0, \quad xy = 4. \]

Eliminating $\lambda$, we get $x^2 - y^2 = 0$. Combining this with the constraint
equation then gives $x^2 - 16/x^2 = 0$, or $x^4 = 16$, which has real solutions
$x = \pm 2$. Putting these values back into the constraint equation xy = 4, we
see that there are two stationary points: (2,2) and (−2,−2). In each case,
the corresponding value of f is $2^2 + 2^2 = 8$.
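
The result of this example can be checked numerically. The sketch below is illustrative only, and the helper names (`grad_f`, `grad_g`, `multiplier`) are introduced here rather than taken from the text. It tests whether grad f is a multiple of grad g at each stationary point, and samples f along one branch of the constraint curve to see which value is smallest there.

```python
def f(x, y):
    return x**2 + y**2

def grad_f(x, y):
    return (2.0 * x, 2.0 * y)

def grad_g(x, y):
    # g(x, y) = x * y, so grad g = (y, x).
    return (y, x)

def multiplier(x, y):
    """Return lambda if grad f = lambda * grad g at (x, y), otherwise None."""
    fx, fy = grad_f(x, y)
    gx, gy = grad_g(x, y)
    # Two plane vectors are aligned when their 2D cross product vanishes.
    if abs(fx * gy - fy * gx) > 1e-12:
        return None
    return fx / gx

stationary = [(2.0, 2.0), (-2.0, -2.0)]
lambdas = [multiplier(x, y) for x, y in stationary]

# A point on the constraint curve that is not stationary, for comparison.
not_stationary = multiplier(1.0, 4.0)

# Sample f along the branch x > 0 of the constraint curve y = 4 / x.
samples = [f(x, 4.0 / x) for x in (0.5 + 0.01 * k for k in range(800))]
smallest = min(samples)
print(lambdas, not_stationary, smallest)
```

Under these assumptions the multiplier comes out as 2 at both stationary points, and the smallest sampled value of f on the branch is close to 8.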


Figure 28 shows a contour map of $f(x,y) = x^2 + y^2$, with the constraint
curve xy = 4 overlaid in blue (there is one branch for x > 0 and another
branch for x < 0). You can see that the constraint curve meets the contour
lines of f(x,y) tangentially at the stationary points (2,2) and (−2,−2),
and that the function f has the value 8 at these points. This is the smallest
value encountered on the curve xy = 4, so the stationary points are local
minima when the constraint is satisfied. They are not the same as the
local minimum (x, y) = (0,0) obtained in the absence of any constraints.


The constant $\lambda$ is called a Lagrange multiplier, after Joseph-Louis
Lagrange (1736-1813), who devised the method that we have just 
described. This method works for functions of three or more variables in a 
very similar way, but it is not always easy to classify the stationary points. 
Applying the eigenvalue test to the function L(x, y), for example, does not 
necessarily tell us the nature of the stationary points of f(x,y) in the 
presence of a constraint. If necessary, a graph such as that in Figure 28 
can be used to distinguish between the various possibilities. 
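
The warning about the eigenvalue test can be illustrated with Example 13 itself. The following worked check is a sketch added for illustration: at the stationary point (2,2) the equation $2x - \lambda y = 0$ gives $\lambda = 2$, so $L(x,y) = x^2 + y^2 - 2xy$, and its Hessian matrix is constant.

```latex
\begin{align*}
L_{xx} = 2, \quad L_{yy} = 2, \quad L_{xy} = -\lambda = -2
\quad &\Longrightarrow \quad
\mathbf{H}_L = \begin{pmatrix} 2 & -2 \\ -2 & 2 \end{pmatrix}, \\
\det(\mathbf{H}_L - \mu \mathbf{I}) = (2 - \mu)^2 - 4 = \mu(\mu - 4) = 0
\quad &\Longrightarrow \quad \mu = 0 \ \text{or} \ \mu = 4.
\end{align*}
```

One eigenvalue is zero, so the eigenvalue test applied to L(x, y) is inconclusive, even though Figure 28 shows that (2,2) is a minimum along the constraint curve.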


Exercise 30 


Find the stationary points of f(x,y) = 5x − 3y subject to the constraint
$x^2 - y^2 = 1$.


Learning outcomes 


After studying this unit, you should be able to do the following. 


• Interpret surfaces, section functions and contour maps used to
represent functions of two variables.


• Calculate first- and second-order partial derivatives of a function of
several variables.


• Use various versions of the chain rule.


• Calculate the first- and second-order Taylor polynomials for a function
of two variables.


• Locate the stationary points of a function of two (or more) variables
by solving a system of two (or more) simultaneous equations.


• Classify the stationary points of a function of two (or more) variables
using the eigenvalue or determinant test.


Figure 28 Contour map for $f(x,y) = x^2 + y^2$, with contour lines (orange) and constraint curve xy = 4 (blue)
Solutions to exercises 


Solution to Exercise 1 


(a) $f(2,3) = 3 \times 2^2 - 2 \times 3^2 = -6$.
(b) $f(3,-2) = 3 \times 3^2 - 2 \times (-2)^2 = 19$.
(c) $f(a,b) = 3 \times a^2 - 2 \times b^2 = 3a^2 - 2b^2$.
(d) $f(b,a) = 3 \times b^2 - 2 \times a^2 = 3b^2 - 2a^2$.
(e) $f(2a,b) = 3 \times (2a)^2 - 2 \times b^2 = 12a^2 - 2b^2$.
(f) $f(a-b,0) = 3 \times (a-b)^2 - 2 \times 0^2 = 3(a-b)^2$.
(g) $f(x,2) = 3 \times x^2 - 2 \times 2^2 = 3x^2 - 8$.

Solution to Exercise 2 
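We have

\[ \frac{\partial g}{\partial \theta} = \cos\theta + \cos\phi \sec^2\theta \quad\text{and}\quad \frac{\partial g}{\partial \phi} = -\sin\phi \tan\theta. \]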


Solution to Exercise 3 


(a) Partially differentiating with respect to x, with y held constant, and
using the product rule for differentiation, we get

\[ f_x(x,y) = 2x e^{3x} + 3(x^2 + y^2) e^{3x} = (3x^2 + 2x + 3y^2) e^{3x}. \]

Partially differentiating with respect to y, with x held constant, gives

\[ f_y(x,y) = 2y e^{3x}. \]

(b) At the point (0,1), the slope of the surface in the x-direction is given
by

\[ f_x(0,1) = (3 \times 0^2 + 2 \times 0 + 3 \times 1^2) e^0 = 3. \]

Solution to Exercise 4 
(a) Holding y and z constant,

\[ \frac{\partial f}{\partial x} = 2(1 + x). \]

Similarly,

\[ \frac{\partial f}{\partial y} = 3(1 + y)^2 \quad\text{and}\quad \frac{\partial f}{\partial z} = 4(1 + z)^3. \]

(b) Holding y and t constant,

\[ \frac{\partial f}{\partial x} = 2xy^3t^4 + 8xt^2 - 2y. \]

Holding x and t constant,

\[ \frac{\partial f}{\partial y} = 3x^2y^2t^4 - 2x + 1. \]

Finally, holding x and y constant,

\[ \frac{\partial f}{\partial t} = 4x^2y^3t^3 + 8x^2t. \]


Solution to Exercise 5 


Differentiating f(x,y) = x sin y with respect to x and treating y as a
constant gives

\[ f_x(x,y) = \frac{\partial}{\partial x}(x \sin y) = \sin y. \]

Differentiating f(x,y) with respect to y and treating x as a constant gives

\[ f_y(x,y) = \frac{\partial}{\partial y}(x \sin y) = x \cos y. \]

Differentiating $f_x$ with respect to x gives

\[ f_{xx}(x,y) = \frac{\partial}{\partial x}(\sin y) = 0, \]

while differentiating $f_x$ with respect to y gives

\[ f_{yx}(x,y) = \frac{\partial}{\partial y}(\sin y) = \cos y. \]

Differentiating $f_y$ with respect to y gives

\[ f_{yy}(x,y) = \frac{\partial}{\partial y}(x \cos y) = -x \sin y, \]

while differentiating $f_y$ with respect to x gives

\[ f_{xy}(x,y) = \frac{\partial}{\partial x}(x \cos y) = \cos y. \]

Evaluating these four second-order partial derivatives at $(2, \pi)$, we get

\[ f_{xx}(2,\pi) = 0, \quad f_{yx}(2,\pi) = \cos\pi = -1, \quad f_{yy}(2,\pi) = -2\sin\pi = 0, \quad f_{xy}(2,\pi) = \cos\pi = -1. \]


Solution to Exercise 6 
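The first-order partial derivatives are

\[ \frac{\partial f}{\partial x} = 2e^{2x+3y} \quad\text{and}\quad \frac{\partial f}{\partial y} = 3e^{2x+3y}. \]

Partially differentiating the first of these functions with respect to y and
the second with respect to x, we get

\[ f_{yx}(x,y) = \frac{\partial}{\partial y}\left(2e^{2x+3y}\right) = 6e^{2x+3y}, \]
\[ f_{xy}(x,y) = \frac{\partial}{\partial x}\left(3e^{2x+3y}\right) = 6e^{2x+3y}. \]

At the point (x,y) = (0,0), we have $f_{xy}(0,0) = f_{yx}(0,0) = 6$.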


Solution to Exercise 7


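The first-order partial derivatives are

\[ \frac{\partial f}{\partial x} = -3\sin(3x - 2t) \quad\text{and}\quad \frac{\partial f}{\partial t} = 2\sin(3x - 2t). \]

The required second-order partial derivatives are then

\[ \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\bigl(-3\sin(3x - 2t)\bigr) = -9\cos(3x - 2t), \]
\[ \frac{\partial^2 f}{\partial t^2} = \frac{\partial}{\partial t}\bigl(2\sin(3x - 2t)\bigr) = -4\cos(3x - 2t). \]

Comparing these two expressions, we see that

\[ \frac{\partial^2 f}{\partial x^2} = \frac{9}{4}\frac{\partial^2 f}{\partial t^2}, \quad\text{or equivalently,}\quad f_{xx} = \frac{9}{4} f_{tt}. \]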


Solution to Exercise 8 


Using the quotient rule, the first-order partial derivatives are

\[ f_x(x,y) = \frac{(x+y)y - xy}{(x+y)^2} = \frac{y^2}{(x+y)^2}, \]
\[ f_y(x,y) = \frac{(x+y)x - xy}{(x+y)^2} = \frac{x^2}{(x+y)^2}. \]

At the initial point (x,y) = (1,4), we have $f_x(1,4) = 16/25$ and
$f_y(1,4) = 1/25$. We also have $\delta x = 0.01$ and $\delta y = -0.01$. The chain rule
then gives

\[ \delta f \simeq f_x(1,4)\,\delta x + f_y(1,4)\,\delta y = \frac{16}{25} \times 0.01 + \frac{1}{25} \times (-0.01) = 0.006. \]


Solution to Exercise 9 


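Carrying out the necessary differentiations, we obtain

\[ \frac{\partial z}{\partial x} = \cos x, \quad \frac{\partial z}{\partial y} = 3\sin y, \quad \frac{dx}{dt} = 2t, \quad \frac{dy}{dt} = 2. \]

The chain rule then gives

\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = 2t\cos x + 6\sin y = 2t\cos(t^2) + 6\sin(2t). \]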


Solution to Exercise 10 


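The required derivatives are

\[ \frac{\partial z}{\partial x} = y\cos x, \quad \frac{\partial z}{\partial y} = \sin x, \quad \frac{dx}{dt} = e^t, \quad \frac{dy}{dt} = 2t. \]

So the chain rule gives

\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = e^t y\cos x + 2t\sin x = t^2 e^t\cos(e^t) + 2t\sin(e^t). \]

At t = 0, dz/dt = 0.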


Solution to Exercise 11 


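We have

\[ \frac{\partial f}{\partial x} = 2x \quad\text{and}\quad \frac{\partial f}{\partial y} = 2y. \]

Also,

\[ \frac{\partial x}{\partial u} = 2, \quad \frac{\partial y}{\partial u} = 3, \quad \frac{\partial x}{\partial v} = 3, \quad \frac{\partial y}{\partial v} = -2. \]

The chain rule then gives

\begin{align*}
\frac{\partial f}{\partial u} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u} \\
&= 2x \times 2 + 2y \times 3 = 4(2u + 3v) + 6(3u - 2v) = 26u
\end{align*}

and

\begin{align*}
\frac{\partial f}{\partial v} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v} \\
&= 2x \times 3 + 2y \times (-2) = 6(2u + 3v) - 4(3u - 2v) = 26v.
\end{align*}

At the point (u,v) = (1,2), we get $f_u(1,2) = 26$ and $f_v(1,2) = 52$.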


Solution to Exercise 12 


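We have

\[ \frac{\partial f}{\partial x} = 2x, \quad \frac{\partial f}{\partial y} = -2y. \]

Also,

\[ \frac{\partial x}{\partial r} = \cos\theta, \quad \frac{\partial y}{\partial r} = \sin\theta, \quad \frac{\partial x}{\partial \theta} = -r\sin\theta, \quad \frac{\partial y}{\partial \theta} = r\cos\theta. \]

So

\begin{align*}
\frac{\partial f}{\partial r} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} \\
&= 2x\cos\theta - 2y\sin\theta = 2r\cos^2\theta - 2r\sin^2\theta = 2r\cos(2\theta)
\end{align*}

and

\begin{align*}
\frac{\partial f}{\partial \theta} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \theta} \\
&= 2x \times (-r\sin\theta) - 2y(r\cos\theta) = -4r^2\cos\theta\sin\theta = -2r^2\sin(2\theta).
\end{align*}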
Solution to Exercise 13


(a) When $\hat{\mathbf{n}}$ points in the x-direction, $n_x = 1$ and $n_y = 0$, so the slope is
$f_x(x,y)$. This agrees with our previous interpretation of $\partial f/\partial x$ as the
slope in the x-direction.


(b) When $\hat{\mathbf{n}}$ points in the y-direction, $n_x = 0$ and $n_y = 1$, so the slope is
$f_y(x,y)$. This agrees with our previous interpretation of $\partial f/\partial y$ as the
slope in the y-direction.

Solution to Exercise 14 

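With $f = xy^2$, the gradient is

\[ \operatorname{grad} f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} = y^2\,\mathbf{i} + 2xy\,\mathbf{j}. \]

At (1,2), this gradient has the value $2^2\,\mathbf{i} + (2 \times 1 \times 2)\,\mathbf{j} = 4\mathbf{i} + 4\mathbf{j}$.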


Solution to Exercise 15 


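The partial derivatives are

\[ \frac{\partial f}{\partial x} = 4xy^2 + 3y^3 \quad\text{and}\quad \frac{\partial f}{\partial y} = 4x^2y + 9xy^2, \]

so the gradient is

\[ \operatorname{grad} f = (4xy^2 + 3y^3)\,\mathbf{i} + (4x^2y + 9xy^2)\,\mathbf{j}. \]

The vector $\mathbf{i} - \mathbf{j}$ is not a unit vector, but the corresponding unit vector is
$\hat{\mathbf{n}} = \frac{1}{\sqrt{2}}\mathbf{i} - \frac{1}{\sqrt{2}}\mathbf{j}$. At any point (x,y), the slope in the direction of $\hat{\mathbf{n}}$ is

\begin{align*}
\hat{\mathbf{n}} \cdot \operatorname{grad} f &= \frac{1}{\sqrt{2}}(4xy^2 + 3y^3) - \frac{1}{\sqrt{2}}(4x^2y + 9xy^2) \\
&= \frac{1}{\sqrt{2}}(3y^3 - 4x^2y - 5xy^2).
\end{align*}

Hence at (1,2), the slope in the direction of the vector $\mathbf{i} - \mathbf{j}$ is
$(24 - 8 - 20)/\sqrt{2} = -2\sqrt{2} = -2.83$ (to three significant figures).
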
Solution to Exercise 16 


Using equation (32), we see that the slope of the surface z = f(x,y) is
most negative when $\cos\alpha = -1$, that is, when $\alpha = \pi$. So the direction of
steepest decrease at (a,b) is opposite to the direction of grad f at (a,b).


Solution to Exercise 17 


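We first calculate

\[ \operatorname{grad} f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} = (4xy - 9x^2)\,\mathbf{i} + 2x^2\,\mathbf{j}. \]

At the point (1,2), this has the value

\[ [\operatorname{grad} f]_{(1,2)} = -\mathbf{i} + 2\mathbf{j}. \]

This vector has magnitude $\sqrt{(-1)^2 + 2^2} = \sqrt{5}$, so the unit vector in the
direction of grad f at (1,2) is $\hat{\mathbf{n}} = (-\mathbf{i} + 2\mathbf{j})/\sqrt{5}$.


This is the direction of steepest increase of the function f(x,y). The bug
should move in the opposite direction, which is the direction of steepest
decrease of the toxicity function. This is along the unit vector $(\mathbf{i} - 2\mathbf{j})/\sqrt{5}$.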


Solution to Exercise 18 
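(a) The gradient is

\[ \operatorname{grad} f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k} = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 4z\,\mathbf{k}. \]

(b) The surface of the object is a contour surface for the function
f(x,y,z). At each point on this surface, the gradient vector is
perpendicular to the surface. At (1,2,1) this gradient vector is

\[ [\operatorname{grad} f]_{(1,2,1)} = 2\mathbf{i} + 4\mathbf{j} + 4\mathbf{k}, \]

which has magnitude $\sqrt{4 + 16 + 16} = 6$. So a unit vector
perpendicular to the surface at (1,2,1) is

\[ \hat{\mathbf{n}} = \tfrac{1}{3}\mathbf{i} + \tfrac{2}{3}\mathbf{j} + \tfrac{2}{3}\mathbf{k}. \]
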
Solution to Exercise 19 


(a) Example 3 found the three partial derivatives $\partial V/\partial x$, $\partial V/\partial y$ and
$\partial V/\partial z$ for the function V(x,y,z). Using these results, the gradient
vector is

\[ \operatorname{grad} V = -\frac{x\mathbf{i} + y\mathbf{j} + z\mathbf{k}}{(x^2 + y^2 + z^2)^{3/2}}. \]

(b) Because of the minus sign, grad V points in the opposite direction to
the position vector $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, so we can say that at each point
(x,y,z), the gradient vector grad V points towards the origin. This is
the direction in which V increases most rapidly.

(c) The square of the magnitude of grad V is

\[ |\operatorname{grad} V|^2 = \frac{x^2 + y^2 + z^2}{(x^2 + y^2 + z^2)^3} = \frac{1}{(x^2 + y^2 + z^2)^2}, \]

so

\[ |\operatorname{grad} V| = \frac{1}{x^2 + y^2 + z^2} \quad\text{for } (x,y,z) \neq (0,0,0). \]


Solution to Exercise 20 
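We have

\[ f(x) = \cos x, \quad f'(x) = -\sin x, \quad f''(x) = -\cos x, \quad f'''(x) = \sin x, \]

and

\[ p(x) = -1 + \tfrac{1}{2}(x - \pi)^2, \quad p'(x) = x - \pi, \quad p''(x) = 1, \quad p'''(x) = 0. \]

Hence $f(\pi) = p(\pi) = -1$, $f'(\pi) = p'(\pi) = 0$, $f''(\pi) = p''(\pi) = 1$ and
$f'''(\pi) = p'''(\pi) = 0$, as required.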


Solution to Exercise 21


(a) Differentiating once, and then again, gives

\[ f'(t) = \frac{1}{1 + t^2} \times 2t = \frac{2t}{1 + t^2}, \]
\[ f''(t) = \frac{(1 + t^2) \times 2 - 2t \times 2t}{(1 + t^2)^2} = \frac{2(1 - t^2)}{(1 + t^2)^2}. \]

(b) At the point t = 0, we have f(0) = ln(1) = 0, f'(0) = 0 and f''(0) = 2.
So the required second-order Taylor polynomial is

\[ p(t) = 0 + 0 \times t + \tfrac{1}{2} \times 2 \times (t - 0)^2 = t^2. \]

(c) At the point t = 1, we have f(1) = ln(2), f'(1) = 1 and f''(1) = 0. So
the required second-order Taylor polynomial is

\begin{align*}
p(t) &= \ln(2) + 1 \times (t - 1) + \tfrac{1}{2} \times 0 \times (t - 1)^2 \\
&= \ln(2) - 1 + t.
\end{align*}

Note that this is the second-order Taylor polynomial, even though it is
a polynomial of order 1. This is because it takes account of the second
derivative f''(1), even though this turns out to be equal to zero.

Solution to Exercise 22 


Setting x = a and y = b in equation (37), we see that
$h(a,b) = p_2(a,b) = f(a,b)$, so the polynomial and the function have the
same value at (a,b).


Taking first-order partial derivatives on both sides of equation (37) gives

\[ h_x(x,y) = \frac{\partial p_2}{\partial x} = f_x(a,b) + f_{xx}(a,b)(x - a) + f_{xy}(a,b)(y - b), \]
\[ h_y(x,y) = \frac{\partial p_2}{\partial y} = f_y(a,b) + f_{xy}(a,b)(x - a) + f_{yy}(a,b)(y - b), \]

so at x = a and y = b we have $h_x(a,b) = f_x(a,b)$ and $h_y(a,b) = f_y(a,b)$,
confirming that the first-order partial derivatives match.


Finally, differentiating again to get the second-order partial derivatives
gives

\[ h_{xx}(x,y) = f_{xx}(a,b), \quad h_{yy}(x,y) = f_{yy}(a,b), \quad h_{xy}(x,y) = f_{xy}(a,b). \]

In particular, we see that $h_{xx}(a,b) = f_{xx}(a,b)$, $h_{yy}(a,b) = f_{yy}(a,b)$ and
$h_{xy}(a,b) = f_{xy}(a,b)$, so the second-order partial derivatives match as well.

Solution to Exercise 23 
We have

\[ f(x,y) = x^2 e^{3y}, \quad f_x(x,y) = 2x e^{3y}, \quad f_y(x,y) = 3x^2 e^{3y}. \]

So at (2,0), we have

\[ f(2,0) = 4, \quad f_x(2,0) = 4, \quad f_y(2,0) = 12. \]

Hence the first-order Taylor polynomial about (2,0) is

\begin{align*}
p_1(x,y) &= f(2,0) + f_x(2,0)(x - 2) + f_y(2,0)(y - 0) \\
&= 4 + 4(x - 2) + 12y \\
&= -4 + 4x + 12y.
\end{align*}

The second-order partial derivatives are

\[ f_{xx}(x,y) = 2e^{3y}, \quad f_{xy}(x,y) = 6x e^{3y}, \quad f_{yy}(x,y) = 9x^2 e^{3y}. \]

So at (2,0), we have

\[ f_{xx}(2,0) = 2, \quad f_{xy}(2,0) = 12, \quad f_{yy}(2,0) = 36. \]

Hence the second-order Taylor polynomial is

\begin{align*}
p_2(x,y) &= p_1(x,y) + \tfrac{1}{2}\bigl(f_{xx}(2,0)(x - 2)^2 + 2f_{xy}(2,0)(x - 2)(y - 0) + f_{yy}(2,0)(y - 0)^2\bigr) \\
&= -4 + 4x + 12y + (x - 2)^2 + 12(x - 2)y + 18y^2.
\end{align*}


Solution to Exercise 24 
Partially differentiating with respect to x and with respect to y, we obtain

\[ f_x(x,y) = 6x - 4y + 4 \quad\text{and}\quad f_y(x,y) = -4x + 4y - 8. \]

Setting these first-order partial derivatives equal to zero gives

\[ 6x - 4y = -4, \qquad -4x + 4y = 8. \]

These equations have the unique solution x = 2, y = 4. So (2,4) is the
only stationary point.


Solution to Exercise 25 


We can write $f(x,y) = x^2y + xy^2 - 3xy$. Partially differentiating with
respect to x and with respect to y gives

\[ f_x(x,y) = 2xy + y^2 - 3y = y(2x + y - 3), \]
\[ f_y(x,y) = x^2 + 2xy - 3x = x(x + 2y - 3). \]

Setting these first-order partial derivatives equal to zero gives

\[ y(2x + y - 3) = 0, \qquad x(x + 2y - 3) = 0. \]

Solving the first equation for y gives either y = 0 or y = 3 − 2x. For y = 0,
the second equation becomes x(x − 3) = 0, which is satisfied by x = 0 and
x = 3. For y = 3 − 2x, the second equation becomes x(3 − 3x) = 0, which
is satisfied by x = 0 (for which y = 3) and x = 1 (for which y = 1).
Collecting together the complete set of solutions, the stationary points
occur at (0,0), (3,0), (0,3) and (1,1).


Solution to Exercise 26
The first-order partial derivatives are

\[ f_x = 4x - y - 3 \quad\text{and}\quad f_y = -x - 6y + 7. \]

Setting these equal to zero, we obtain the simultaneous equations

\[ 4x - y = 3, \qquad -x - 6y = -7, \]

which have the solution x = 1, y = 1. So the only stationary point is at
(1,1).

The second-order partial derivatives are $f_{xx} = 4$, $f_{yy} = -6$ and $f_{xy} = -1$.
Evaluating these constant functions at (1,1) then gives $f_{xx}(1,1) = 4$,
$f_{yy}(1,1) = -6$ and $f_{xy}(1,1) = -1$. So the Hessian matrix at (1,1) is

\[ \mathbf{H} = \begin{pmatrix} 4 & -1 \\ -1 & -6 \end{pmatrix}. \]

The eigenvalues satisfy

\[ 0 = \det(\mathbf{H} - \lambda\mathbf{I}) = \begin{vmatrix} 4 - \lambda & -1 \\ -1 & -6 - \lambda \end{vmatrix} = (\lambda - 4)(\lambda + 6) - 1. \]

So $\lambda^2 + 2\lambda - 25 = 0$, and the eigenvalues are

\[ \lambda = \frac{-2 \pm \sqrt{104}}{2} = -1 \pm \sqrt{26}. \]

These have opposite signs, so the stationary point is a saddle point.


Solution to Exercise 27 
In Example 11, the Hessian matrix at the stationary point (0,0) is

\[ \mathbf{H} = \begin{pmatrix} -2 & 0 \\ 0 & -2 \end{pmatrix}, \]

so det H = 4 > 0 and $f_{xx}(0,0) = -2 < 0$. By the determinant test, (0,0) is
a local maximum.

In Exercise 26, the Hessian matrix at the stationary point (1,1) is

\[ \mathbf{H} = \begin{pmatrix} 4 & -1 \\ -1 & -6 \end{pmatrix}, \]

so det H = −25 < 0. By the determinant test, (1,1) is a saddle point.


Solution to Exercise 28 
The first-order partial derivatives of f(x,y,z) are

\[ f_x = 5x + 2y, \quad f_y = 5y + 2x, \quad f_z = 2z. \]

The set of simultaneous equations $f_x = 0$, $f_y = 0$ and $f_z = 0$ has the
unique solution x = y = z = 0, so the only stationary point is at (0,0,0).


The non-zero second-order partial derivatives are

\[ f_{xx} = 5, \quad f_{yy} = 5, \quad f_{zz} = 2, \quad f_{xy} = f_{yx} = 2. \]

These are constants, so the Hessian matrix at the stationary point is

\[ \mathbf{H} = \begin{pmatrix} 5 & 2 & 0 \\ 2 & 5 & 0 \\ 0 & 0 & 2 \end{pmatrix}. \]

The eigenvalues of this matrix are given by the characteristic equation

\begin{align*}
0 &= \begin{vmatrix} 5 - \lambda & 2 & 0 \\ 2 & 5 - \lambda & 0 \\ 0 & 0 & 2 - \lambda \end{vmatrix} \\
&= (5 - \lambda)(5 - \lambda)(2 - \lambda) - 2(2)(2 - \lambda) \\
&= (2 - \lambda)(\lambda^2 - 10\lambda + 21) \\
&= (2 - \lambda)(\lambda - 3)(\lambda - 7),
\end{align*}

so the eigenvalues are 2, 3 and 7. Since these are all positive, the
eigenvalue test tells us that (0,0,0) is a local minimum.


Solution to Exercise 29 


(a) The non-zero second-order partial derivatives are $f_{xx} = 2$, $f_{yy} = 4$,
$f_{zz} = 6$. These values are constants, so at the stationary point (0,0,0)
the Hessian matrix is

\[ \mathbf{H} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 6 \end{pmatrix}. \]

This is a diagonal matrix, so its eigenvalues are 2, 4 and 6. These are
all positive, so the eigenvalue test tells us that the stationary point is
a local minimum.


(b) By an argument similar to that in part (a), the eigenvalues are 2, 4
and −6. These have mixed signs, so the stationary point is a saddle
point.


(c) The non-zero second-order partial derivatives are $f_{xx} = 2$, $f_{yy} = 4$,
$f_{zz} = 36z^2$. At the stationary point (0,0,0), these have values 2, 4
and 0, giving the Hessian matrix

\[ \mathbf{H} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]

This is a diagonal matrix, so its eigenvalues are 2, 4 and 0. These
include a zero eigenvalue, so the eigenvalue test is inconclusive. In
fact, this stationary point is a local minimum, but that is not revealed
by the eigenvalue test.


(d) The non-zero second-order partial derivatives are $f_{xx} = 2$, $f_{yy} = 4$,
$f_{zz} = -36z^2$. At the stationary point (0,0,0), these have values 2, 4
and 0, giving the same Hessian matrix as in part (c). The eigenvalue
test is again inconclusive. In fact, this stationary point is a saddle
point, but that is not revealed by the eigenvalue test.


Solution to Exercise 30
We form the function

\[ L(x,y) = 5x - 3y - \lambda(x^2 - y^2), \]

and calculate its first-order partial derivatives $L_x = 5 - 2\lambda x$ and
$L_y = -3 + 2\lambda y$. Setting these equal to zero and using the constraint
equation gives

\[ 5 - 2\lambda x = 0, \quad -3 + 2\lambda y = 0, \quad x^2 - y^2 = 1. \]

Eliminating $\lambda$ from the first two equations gives 5y − 3x = 0, so y = 3x/5.
Substituting this into the constraint equation, we obtain $x^2(1 - 9/25) = 1$,
so $x = \pm 5/4$. We obtain the corresponding values of y from the equation
y = 3x/5. So there are two stationary points: (5/4, 3/4) and (−5/4, −3/4).
Unit 8


Multiple integrals 


Introduction 


The previous unit introduced functions of two and more variables, and 
explained how to differentiate them. In this unit we will integrate such 
functions. 


Suppose that a thin rod lies along the x-axis, with one end at x = 0 and
the other end at x = L. If the rod has a uniform composition, its linear
density $\lambda$ (its mass per unit length) is constant everywhere along its
length, and the total mass of the rod is $M = \lambda L$.


A more interesting case arises when the rod is non-uniform (Figure 1). In
this case the linear density $\lambda(x)$ is a function of position, and a small
element of the rod, centred on x and of length $\delta x$, has mass

\[ \delta M \simeq \lambda(x)\,\delta x. \tag{1} \]


Strictly speaking, this is an approximation because we have ignored any
variation of $\lambda(x)$ within the element, but the approximation becomes
increasingly accurate as the element becomes smaller.


The total mass of the rod can be found by adding together the masses of
all its elements. We consider this sum in the limit where the number of
elements tends to infinity and the length of each element becomes
vanishingly small. In this limit, any approximation involved in
equation (1) becomes negligible, and the sum becomes the definite integral

\[ M = \int_0^L \lambda(x)\,dx. \tag{2} \]


To find the total mass of the rod, we calculate this integral using the 
standard rules of calculus. The integral takes a formula for each tiny part 
and gives an answer for the whole. Not surprisingly, the word integral 
comes from the medieval Latin integralis meaning ‘forming a whole’. 
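
The passage from the sum over elements to the integral can be illustrated numerically. The sketch below is only an illustration: the density $\lambda(x) = 1 + x^2$ and the rod length L = 2 are assumptions chosen here, not values from the text. It adds up $\lambda(x)\,\delta x$ over n elements, each evaluated at its centre, and compares a coarse and a fine subdivision with the exact integral, which is 14/3 for this choice.

```python
def density(x):
    # Hypothetical linear density (mass per unit length): lambda(x) = 1 + x^2.
    return 1.0 + x * x

def rod_mass(length, n):
    """Sum density(x) * dx over n elements, each centred on its midpoint x."""
    dx = length / n
    return sum(density((k + 0.5) * dx) * dx for k in range(n))

# Exact value of the integral of 1 + x^2 from 0 to 2.
exact = 2.0 + 8.0 / 3.0

coarse = rod_mass(2.0, 4)     # four elements of length 0.5
fine = rod_mass(2.0, 1000)    # one thousand elements of length 0.002

print(coarse, fine, exact)
```

With four elements the sum is 4.625, already close to the exact value. With a thousand elements the difference is far smaller, in line with the claim that the approximation becomes negligible as the elements shrink.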


Crucially for this unit, the tiny elements need not be straight-line segments 
laid end to end. They could be rectangular elements covering a surface, or 
tiny brick-shaped elements filling out a volume in three-dimensional space. 


Suppose that an oval metal plate of non-uniform composition lies in the
xy-plane. In this case each point on the plate can be labelled by its (x, y)
coordinates, and the non-uniform composition of the plate can be
characterised by a surface density function f(x,y), which represents
the mass per unit area at any given point (x, y).


The plate can be approximately covered by tiny rectangular area elements
aligned with the x- and y-axes, as shown in Figure 2. With rectangular
elements the coverage is only approximate, but the approximation is a 
good one if the elements are small enough. 


A typical area element is centred on the point (x,y) and has sides of length
$\delta x$ and $\delta y$. This element has area $\delta A = \delta x\,\delta y$, and its mass is

\[ \delta M \simeq f(x,y)\,\delta A, \tag{3} \]

where an approximation sign is used because the surface density may vary
slightly within the element. However, the approximation becomes
increasingly accurate as the element becomes smaller and smaller.


Figure 1 A rod with linear density $\lambda(x)$. The mass contributed by an element centred on x and of length $\delta x$ is $\lambda(x)\,\delta x$.

Equation (1) may be regarded as the definition of the linear density $\lambda(x)$.

Figure 2 A typical rectangular area element

A cuboid is a rectangular box.


The total mass of the plate is the sum of the masses of all its elements. We 
consider this sum in the limit where the number of elements tends to 
infinity and the area of each element becomes vanishingly small. In this 
limit, the total area occupied by the rectangular elements becomes 
identical to the area occupied by the plate, and any approximation 
involved in equation (3) becomes negligible. This is just like the process 
that led from equation (1) to equation (2), so the sum over all the tiny 
elements can be regarded as an integral over the area of the plate. The 
mass of the plate is written as 


\[ M = \int_S f(x,y)\,dA, \]


where the subscript S on the integral sign indicates that elements exactly 
cover the surface S of the plate. This expression is called an area integral. 
Of course, we have not told you how to evaluate such an integral, but 
Section 1 will explain how this is done: by performing two definite 
integrals in succession — one over x and the other over y. 


The ideas behind this example can be extended. For instance, instead of 
integrating over a region in a plane, we can integrate over a curved surface, 
such as the almost spherical surface of the Earth. If you know the 
population density (the average number of people per unit area) at each 
point on Earth, the total human population is found by integrating this 
population density over the surface of the Earth. Such an integral is called 
a surface integral (rather than an area integral, which is a term that is 
restricted to planar surfaces). 


We can also consider three-dimensional objects rather than surfaces. 
A three-dimensional object can be approximated by many tiny cuboid 
volume elements aligned with the x-, y- and z-axes. A typical volume 
element has coordinates (x, y, z) and sides of length $\delta x$, $\delta y$ and $\delta z$. Its
volume is $\delta V = \delta x\,\delta y\,\delta z$, and its mass is given by

\[ \delta M \simeq f(x,y,z)\,\delta V, \]

where f(x, y, z) is the density of the object (its mass per unit volume) at
the point (x,y,z). The total mass of the object is the sum of the masses of
all the volume elements. We consider this sum in the limit of infinitely
many volume elements, each of vanishingly small volume. In this limit, the
sum can be expressed as an integral over the volume of the object. We
write

\[ M = \int_R f(x,y,z)\,dV, \]


where the subscript R shows that the volume elements exactly cover the 
region R occupied by the object. Such an expression is called a volume 
integral. We have not told you how to evaluate such an integral, but 
Section 2 will show how this is done: by performing three definite integrals 
in succession — over x, y and z. 


The above examples used Cartesian coordinates, but it is always possible,
and often preferable, to use non-Cartesian coordinates. For example, a
given area integral can be evaluated using either Cartesian coordinates or
polar coordinates. All coordinate systems give the same answer, but one
choice may make life easier than another, and part of the skill of
evaluating area, surface and volume integrals is to choose a suitable
coordinate system. You will see how to use non-Cartesian coordinates in
the second half of this unit.


Uses of surface and volume integrals 


Scientists and engineers often need to evaluate area, surface or volume 
integrals. For example, Figure 3 shows the Hoover Dam in the 
Colorado River, built in the 1930s to provide irrigation and 
hydroelectric power. The quantity of concrete used in this structure 
can be calculated using a volume integral (it is 2.5 million cubic 
metres). If the curved surface of the dam were to be painted, we could 
work out the area to be covered by evaluating a surface integral. 


In general, volume and surface integrals are useful whenever we have 
a quantity, such as mass or volume, that is additive. An additive 
quantity is one whose value over a region is equal to the sum of 
contributions from the region’s constituent parts. For example, the 
mass of any object subdivided into many volume elements is the sum 
of the masses of these elements. This is what allows us to express the 
mass of the object first as a sum, and then as an integral. 


There are many other physical quantities that are additive, including 
electric charge, energy and particle number. Each of these quantities 
can be characterised by a density, which may be per unit area or per 
unit volume. For example, we may talk about the energy density in 
the Sun at a given time (Figure 4). The total energy is found by 
integrating this energy density over the volume of the Sun. We may 
also talk about the number density of bacteria per unit area on the 
surface of a laboratory dish. If this number density is modelled by a 
smooth function of position, then the total number of bacteria on the 
surface of the dish is found by integrating the number density over 
the surface of the dish. 


Not all physical quantities are additive. For example, we cannot say 
that the temperature of an object is the sum of the temperatures of 
its parts, and there is no meaningful physical quantity corresponding 
to the temperature per unit volume. Nevertheless, if temperature is a 
function T(x, y,z) of position, we may integrate this function over a 
region, and divide by the volume of the region to give a measure of 
the average temperature in the region. 
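
The idea of an average temperature can be sketched numerically. In the illustration below, the temperature field T(x, y, z) and the unit-cube region are assumptions made here for the sake of the example and are not taken from the text. The code sums $T\,\delta V$ over small cuboid elements and divides by the volume of the region.

```python
def temperature(x, y, z):
    # Hypothetical temperature field (an assumption for illustration only).
    return 20.0 + x + 2.0 * y + 3.0 * z * z

def average_temperature(n, side=1.0):
    """Average of temperature over the cube 0 <= x, y, z <= side.

    The cube is divided into n**3 small cuboid elements, and the field is
    evaluated at the centre of each element.
    """
    d = side / n
    dV = d ** 3
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += temperature((i + 0.5) * d,
                                     (j + 0.5) * d,
                                     (k + 0.5) * d) * dV
    # Dividing the integral by the volume of the region gives the average.
    return total / side ** 3

avg = average_temperature(20)
print(avg)
```

For this particular field the exact average over the unit cube is 20 + 1/2 + 1 + 1 = 22.5, and the sum over 8000 elements lies within about 0.001 of that value.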


Figure 3 The Hoover Dam

Some scientists use the term extensive instead of additive.

Figure 4 The surface of the Sun imaged by NASA’s Solar Dynamics Observatory

Figure 5 A rectangular slab


Study guide


This unit shows you how to evaluate surface and volume integrals by 
performing two or three definite integrals in succession. Some of the 
techniques of integration described in Unit 1 will be used. The unit also 
uses properties of vector products and determinants that were covered in 
Unit 4, and the chain rule of partial differentiation, which was introduced 
in Unit 7. 


Section 1 explains how to evaluate area integrals using Cartesian 
coordinates, and Section 2 deals with volume integrals in Cartesian 
coordinates. 


Section 3 introduces several non-Cartesian coordinate systems and shows 
how they are used to simplify the evaluation of area and volume integrals. 
Section 4 gives a review of different types of coordinate system. It unifies 
all the discussion given earlier in the unit by introducing two important 
new concepts: scale factors and Jacobian factors. Finally, Section 5 
discusses surface integrals over curved surfaces. 


1 Area integrals in Cartesian 
coordinates 


In this section, we consider area integrals of functions f(x,y), where x 
and y are the Cartesian coordinates of points in a plane. The regions over 
which these functions are integrated will also be specified in Cartesian 
coordinates. 


1.1 Area integrals over rectangular regions 


We begin with a simple case. Given a function f(x,y) of Cartesian 
coordinates x and y, we will show how to integrate it over a rectangular 
region in the xy-plane. 


Figure 5 shows a rectangular slab in the xy-plane. This could be a small 
courtyard, for example. One corner of the slab is at the origin, and the 
slab extends to x = 3 and y = 2 (measured in metres). The slab is 
unevenly covered with a layer of snow whose mass per unit area (measured 
in kilograms per square metre) at the point (x, y) is


\[ f(x, y) = x y^2 \quad \text{for } 0 \le x \le 3 \text{ and } 0 \le y \le 2. \tag{4} \]


The function f(x,y) is defined everywhere on the surface of the slab, and 
represents the surface density of the snow. We can find the total mass of 
snow on the slab by carrying out a suitable integral. 




First, consider the narrow shaded strip in Figure 6, which is centred on
$y = y_0$ and has width $\delta y$. This strip is made up of many tiny rectangular
elements placed end to end, parallel to the x-axis, just as in Figure 1. The
mass of a single element in the strip, with linear dimensions $\delta x$ and $\delta y$,
centred on the point $(x, y_0)$, is found by multiplying the surface density by
the area of the element. From equation (4), the surface density at $(x, y_0)$ is
$f(x, y_0) = x y_0^2$, so the mass of the element is


\[ \delta M_{\text{element}} \simeq x y_0^2 \, \delta x \, \delta y. \tag{5} \]


An approximation sign is used here because the surface density varies 
within the element. However, an equals sign will be used from now on, on 
the understanding that any error will become negligible when we take the 
limit of vanishingly small elements and integrate. 


We can calculate the mass of the strip in Figure 6 by integrating along the 
x-axis, from one end of the strip to the other. All elements in the strip are 
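centred on $y = y_0$, so this value of y remains constant during the
integration. Integrating equation (5) from x = 0 to x = 3, we get


\[ \delta M_{\text{strip}} = \left( \int_{x=0}^{x=3} x y_0^2 \, dx \right) \delta y = \left[ \tfrac{1}{2} x^2 y_0^2 \right]_{x=0}^{x=3} \delta y = \tfrac{9}{2} y_0^2 \, \delta y. \]
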
Notice that the integration has been with respect to x, with y held at a 
constant value y = yo. This is reminiscent of partial differentiation, where 
we differentiate with respect to x, while holding y constant. In effect, we 
have partially integrated the function f(x,y) with respect to x (although 
this terminology is rarely used in practice). Notice too that the limits in 
the integral have been written explicitly as x = 0 and x = 3, rather than 
simply as 0 and 3. This is a useful precaution when dealing with a function 
that depends on more than one variable. 


We denoted the constant value of y by $y_0$ to emphasise the fact that the
value of y was held constant during the integration with respect to x.
Nevertheless, our formula for the mass of a strip is valid for any value
$y = y_0$ within the region of the slab. We can therefore replace $y_0$ by y to
give the mass of snow in any narrow strip of width $\delta y$, centred on y:


\[ \delta M_{\text{strip}} = \tfrac{9}{2} y^2 \, \delta y. \tag{6} \]


To find the total mass of snow on the slab, we must add up contributions 
from all the narrow strips between y = 0 and y = 2 (see Figure 7). We do 
this in the limit of vanishingly thin strips, which allows us to replace the 
sum by a definite integral. Equation (6) gives the mass of a strip, and
$9y^2/2$ is the corresponding mass per unit length in the y-direction. To find
the total mass of snow on the slab, we integrate $9y^2/2$ with respect to y,
from y = 0 to y = 2. This gives a total mass of


\[ M = \int_{y=0}^{y=2} \tfrac{9}{2} y^2 \, dy = \tfrac{9}{2} \left[ \tfrac{1}{3} y^3 \right]_{y=0}^{y=2} = 12, \]


so we conclude that the total mass of snow on the slab is 12 kilograms. 
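The strip construction can be imitated numerically. The following sketch is only an approximation under stated assumptions: it replaces the limits of vanishingly small elements and strips by finite midpoint sums, and the helper names strip_mass and total_mass, together with the numbers of elements, are chosen purely for illustration.

```python
def surface_density(x, y):
    """Surface density of the snow from equation (4), in kg per square metre."""
    return x * y**2


def strip_mass(y0, dy, nx=300):
    """Mass of a strip of width dy centred on y = y0.

    The strip is cut into nx elements of length dx laid end to end along
    the x-direction, from x = 0 to x = 3.
    """
    dx = 3.0 / nx
    return sum(surface_density((i + 0.5) * dx, y0) * dx * dy for i in range(nx))


def total_mass(ny=200):
    """Add up the masses of ny strips between y = 0 and y = 2."""
    dy = 2.0 / ny
    return sum(strip_mass((j + 0.5) * dy, dy) for j in range(ny))


M = total_mass()
print(M)
```

With these settings the sum should lie very close to the exact value of 12 kilograms, and strip_mass(1.0, 1.0) should reproduce the mass per unit length 9/2 predicted by equation (6) at y = 1.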


Figure 6 A narrow strip parallel to the x-axis is made up of many tiny
elements


Figure 7 The slab is composed of many narrow strips
In some texts, the brackets enclosing the inner integral are omitted. This
is on the strict understanding that the innermost integral is always
evaluated first.


We assume here that a < b and c < d.


Figure 8 A narrow strip parallel to the y-axis
Once the principles behind this calculation are understood, it just amounts
to doing two integrals in succession, one with respect to x and the other 
with respect to y. We can write this double integral as 


\[ M = \int_{y=0}^{y=2} \left( \int_{x=0}^{x=3} x y^2 \, dx \right) dy, \]


where the definite integral in brackets is calculated first, treating y as a 
constant. After the inner integral has been evaluated, and the upper and 
lower limits for x have been applied, we have a function of y only. This 
function is integrated with respect to y, and the final answer is obtained by 
applying the limits for y. This procedure is readily generalised. 


Area integral over a rectangular region 


The area integral of a function f(x,y) over a region S is denoted by 


\[ \int_S f(x, y) \, dA. \]


If S is rectangular and bounded by the lines x = a, x = b and y =c, 
y = d, the area integral can be obtained as two successive integrals: 


\[ \int_S f(x, y) \, dA = \int_{y=c}^{y=d} \left( \int_{x=a}^{x=b} f(x, y) \, dx \right) dy. \tag{7} \]


In the integral over x we treat y as a constant. 


In equation (7), there is an inner integral (enclosed by brackets) and an 
outer integral. The inner integral is always performed first. But there is 
nothing to prevent us from doing things the other way round, integrating 
first with respect to y and then with respect to x. 


Returning to the example of the mass of snow on a slab, we could equally 
well begin by finding the mass of a narrow strip parallel to the y-axis (such 
as that in Figure 8), and then find the total mass of all such strips. 


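Integrating the surface density $x y^2$ first with respect to y and then with
respect to x gives


\[ M = \int_{x=0}^{x=3} \left( \int_{y=0}^{y=2} x y^2 \, dy \right) dx = \int_{x=0}^{x=3} \tfrac{8}{3} x \, dx = \tfrac{8}{3} \left[ \tfrac{1}{2} x^2 \right]_{x=0}^{x=3} = \tfrac{8}{3} \times \tfrac{9}{2} = 12, \]
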
which is the same as our previous answer. This is just as expected: our 
decision to subdivide the rectangular slab into thin horizontal strips (as in 
Figure 7) or into thin vertical strips (as in Figure 8) cannot affect the 
amount of snow on the slab. 




More generally, the area integral in equation (7) can be rewritten as 


\[ \int_S f(x, y) \, dA = \int_{x=a}^{x=b} \left( \int_{y=c}^{y=d} f(x, y) \, dy \right) dx. \tag{8} \]


Although the ordering of the integrations makes no difference to the final 
answer, one ordering may involve easier integrations than the other. You 
are free to choose whichever order makes the calculation easier. 


Example 1 


Find the value of the area integral of the function f(x,y) = y cos(xy) over
the rectangle S bounded by the lines x = 0, x = 2 and $y = \pi/2$, $y = \pi$.


Solution 


The region of integration is shown in Figure 9. Choosing to integrate over 
x first, the required area integral is 


\[ \int_S y \cos(xy) \, dA = \int_{y=\pi/2}^{y=\pi} \left( \int_{x=0}^{x=2} y \cos(xy) \, dx \right) dy. \]


The integral in brackets may look tricky but it is easily evaluated because, 
when integrating over x, we treat y as a constant. For constant y we have 
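

\[ \int_{x=0}^{x=2} y \cos(xy) \, dx = \Big[ \sin(xy) \Big]_{x=0}^{x=2} = \sin(2y), \]


so


\[ \int_S y \cos(xy) \, dA = \int_{y=\pi/2}^{y=\pi} \sin(2y) \, dy = \left[ -\tfrac{1}{2} \cos(2y) \right]_{y=\pi/2}^{y=\pi} = -\tfrac{1}{2} \cos(2\pi) + \tfrac{1}{2} \cos(\pi) = -1. \]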


The negative answer is not a problem: it arises because the function
f(x,y) = y cos(xy) is more negative than positive in the given region S.


In the above example, the decision to integrate first with respect to x was
key. In theory, we could have integrated first with respect to y, but the
integrals would then have been harder. There is no merit in taking a tough
route when an easier one lies open, so be prepared to switch the order of 
integration if your first choice leads to an impasse. 
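Although one ordering is harder to carry out by hand, both orderings must give the same number. As a minimal numerical check of Example 1, the sketch below evaluates the two iterated integrals with a composite Simpson's rule. The helper simpson and the number of subintervals are assumptions made for this illustration; Simpson's rule is a stand-in for the exact integrations in the text.

```python
import math


def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def f(x, y):
    """Integrand of Example 1."""
    return y * math.cos(x * y)


# Inner integral over x (y held constant), then outer integral over y.
x_first = simpson(lambda y: simpson(lambda x: f(x, y), 0.0, 2.0),
                  math.pi / 2, math.pi)

# Inner integral over y (x held constant), then outer integral over x.
y_first = simpson(lambda x: simpson(lambda y: f(x, y), math.pi / 2, math.pi),
                  0.0, 2.0)

print(x_first, y_first)
```

Both values should agree with the exact answer −1 to many decimal places, whichever variable is integrated first.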


Product functions integrated over rectangular regions 


A special case arises when the function to be integrated over a rectangle 
takes the form f(x,y) = h(x) g(y), which is the product of a function h(x) 
of x only and a function g(y) of y only. In this case, the area integral of 
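h(x) g(y) over a rectangular region is simply the product of two ordinary
integrals:


\[ \int_{y=c}^{y=d} \left( \int_{x=a}^{x=b} h(x) \, g(y) \, dx \right) dy = \left( \int_{x=a}^{x=b} h(x) \, dx \right) \times \left( \int_{y=c}^{y=d} g(y) \, dy \right). \]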


Figure 9 The region of integration for Example 1


Remember: if a is a constant,

\[ \int \cos(ax) \, dx = \frac{\sin(ax)}{a} + C. \]
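For example, integrating $f(x,y) = x y^2$ over the region of Figure 5, we have


\[ \int_S x y^2 \, dA = \left( \int_{x=0}^{x=3} x \, dx \right) \left( \int_{y=0}^{y=2} y^2 \, dy \right) = \left[ \tfrac{1}{2} x^2 \right]_{x=0}^{x=3} \times \left[ \tfrac{1}{3} y^3 \right]_{y=0}^{y=2} = \tfrac{9}{2} \times \tfrac{8}{3} = 12, \]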


in agreement with both our earlier calculations. 

However, you must be very careful: splitting an area integral into two 
factors works only under very special circumstances. 

• The integrand f(x,y) must be a product of the form h(x) g(y).


• All the limits of integration must be constants: for Cartesian
coordinates (x,y), this means that the region of integration must be a
rectangle with its edges aligned with the x- and y-axes.


Other area integrals cannot be split in this way. For example, if the 
integrand is a sum of the form g(x) + h(y), the area integral does not split 
into the sum of an integral over x and an integral over y (see Exercise 2). 
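The contrast between product and sum integrands can be checked with exact arithmetic. In the sketch below, the helper power_integral and the choice of the sum integrand x + y on the slab of Figure 5 are assumptions made for illustration. The point is that, for a sum, each one-variable integral must still be multiplied by the width of the rectangle in the other direction.

```python
from fractions import Fraction


def power_integral(n, a, b):
    """Exact integral of t**n from t = a to t = b, for an integer n >= 0."""
    return (Fraction(b) ** (n + 1) - Fraction(a) ** (n + 1)) / (n + 1)


# Product integrand x * y**2 on the rectangle 0 <= x <= 3, 0 <= y <= 2:
# the area integral is the product of two ordinary integrals.
product = power_integral(1, 0, 3) * power_integral(2, 0, 2)

# Sum integrand x + y on the same rectangle. Integrating x over the
# rectangle gives (integral of x dx) * (width in y), and similarly for y.
correct = power_integral(1, 0, 3) * (2 - 0) + (3 - 0) * power_integral(1, 0, 2)

# Naive (incorrect) attempt: just add the two one-variable integrals.
naive = power_integral(1, 0, 3) + power_integral(1, 0, 2)

print(product, correct, naive)
```

Here product is 12, in agreement with the calculation above, while correct is 15 and naive is 13/2, so the naive splitting of a sum gives the wrong answer.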


Exercise 1 


Find the value of the area integral of the function f(x,y) = 2?y? over the 
square S bounded by the lines x = 0, x = 2, y = 1 and y = 3.


Exercise 2 


Evaluate the area integral of the function f(x,y) = 1 + x + y over the
rectangle S bounded by the lines x = 1, x = 4, y = 0 and y = 3.


Exercise 3 


Evaluate the area integral 


\[ I = \int_S \cos(x + y) \, dA, \]


where S is the square region $0 \le x \le \pi$, $0 \le y \le \pi$.


1.2 Area integrals over non-rectangular regions 


The area integrals considered so far have all been over rectangular regions 
aligned with the coordinate axes. This made it easy to determine the 
limits of integration. For non-rectangular regions we must be more careful 
in setting up the integrals because the strips are no longer of the same 
length and the limits on the inner integral depend on the variable in the 
outer integral. To illustrate this, consider the following example. 


Example 2 


Find the value of the area integral of the function f(x,y) = xy over the
region S bounded by the curves $y = x^2$ and $y = x$ for $0 \le x \le 1$.




Solution 


We begin by drawing a diagram of the region of integration (Figure 10) and 
choose to integrate first over y and then over x. To determine the limits of 
the integration over y, consider a vertical strip drawn at an arbitrary fixed 
value of x within the given region. The ends of the strip lie on the curves 
$y = x^2$ and $y = x$. So for a given value of x, the lower limit for the
y-integration is $y = x^2$, and the upper limit is $y = x$. These limits are the
right way round because $x^2 \le x$ for $0 \le x \le 1$, as shown in the diagram.


The contribution to the area integral from the narrow vertical strip of
width $\delta x$, centred on x, is found by integrating along the length of the
strip:


\[ \text{contribution of strip} = \left( \int_{y=x^2}^{y=x} x y \, dy \right) \delta x, \]


where x has a fixed value for a given strip.

A subsequent integration over x sums over all the vertical strips in the 
region. In this example, the first strip is at x = 0 and the last strip is at 


x = 1. Hence the lower and upper limits for the x-integration are x = 0
and x = 1, and the complete area integral is


\[ \int_S x y \, dA = \int_{x=0}^{x=1} \left( \int_{y=x^2}^{y=x} x y \, dy \right) dx. \tag{9} \]


The integral enclosed by brackets is carried out first. This is with respect 
to y and is evaluated by treating x as a constant, giving 


\[ \int_{y=x^2}^{y=x} x y \, dy = \left[ \tfrac{1}{2} x y^2 \right]_{y=x^2}^{y=x} = \tfrac{1}{2} (x^3 - x^5). \]


The result is a function of x only. This is finally integrated over x to give 
the area integral: 


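\[ \int_S x y \, dA = \int_{x=0}^{x=1} \tfrac{1}{2} (x^3 - x^5) \, dx = \tfrac{1}{2} \left[ \tfrac{1}{4} x^4 - \tfrac{1}{6} x^6 \right]_{x=0}^{x=1} = \tfrac{1}{24}. \]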


Let us review this example. We started by drawing a diagram that helped 
us to find the limits of integration. This is an essential step. We chose to 
integrate first with respect to y, with x held constant. The limits of the 
inner y-integral were functions of x, and the limits of the outer x-integral 
were constants. The area integral was then found by two successive 
integrations, the first over y (with x held constant) and the second over x. 


Although the integrand in this case is a product of a function of x and a 
function of y, the area integral does not reduce to the product of two 
ordinary integrals. This is because the limits of integration are not all 
constants. In equation (9), the limits of the y-integral depend on x, so we 
must do this integral first, and then integrate the result over x. 


The general method is readily extended to other area integrals. 


Figure 10 The region of integration for Example 2
Figure 11 The region of integration for an area integral. The y-limits are
given by the equations $y = \alpha(x)$ and $y = \beta(x)$ of the boundary curves.


Procedure 1 Evaluating an area integral


To evaluate an area integral $\int_S f(x,y) \, dA$ over any given region S of
the xy-plane, we must decide which integral to do first: that over x
or that over y. The following steps assume that we have chosen to
integrate first with respect to y, and then with respect to x.


1. Draw a diagram showing the region of integration S.


2. Draw a vertical strip parallel to the y-axis, centred on x, and
spanning the region (as in Figure 11). Determine the lower limit
$y = \alpha(x)$ and the upper limit $y = \beta(x)$ for this strip. These are
the limits for the y-integration (the inner integration). For
non-rectangular regions, they are non-constant functions of x.


3. Determine the minimum value x = a and the maximum value
x = b for x-values throughout the region. These are the limits for
the x-integration (the outer integration), and are always
constants.


4. Write down the area integral as


\[ \int_S f(x, y) \, dA = \int_{x=a}^{x=b} \left( \int_{y=\alpha(x)}^{y=\beta(x)} f(x, y) \, dy \right) dx. \tag{10} \]


5. Evaluate the inner integral over y first, holding x constant, and
substituting in the limits of integration. This gives a function


\[ g(x) = \int_{y=\alpha(x)}^{y=\beta(x)} f(x, y) \, dy, \]


which remains to be integrated over x.


6. Evaluate the remaining definite integral of g(x) over x. 
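Steps 4 to 6 of the procedure translate quite directly into a short numerical routine. The sketch below is one possible arrangement, not a prescribed method: the function names area_integral and simpson are invented for illustration, and Simpson's rule takes the place of exact integration. It is applied to Example 2, where the limits are $\alpha(x) = x^2$ and $\beta(x) = x$.

```python
def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def area_integral(f, a, b, alpha, beta, n=200):
    """Integrate f(x, y) over the region a <= x <= b, alpha(x) <= y <= beta(x)."""

    def g(x):
        # Step 5: inner integral over y, with x held constant.
        return simpson(lambda y: f(x, y), alpha(x), beta(x), n)

    # Step 6: outer integral of g(x) between the constant limits a and b.
    return simpson(g, a, b, n)


# Example 2: f(x, y) = x*y between the curves y = x**2 and y = x, 0 <= x <= 1.
value = area_integral(lambda x, y: x * y, 0.0, 1.0,
                      lambda x: x**2, lambda x: x)
print(value)
```

The printed value should be very close to the exact answer 1/24 found in Example 2. Passing the constant function f(x, y) = 1 instead would give an estimate of the area of the region.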


Step 2 of this procedure may sometimes fail. In Figure 12, for example, a
strip parallel to the y-axis reaches the boundary of the region before it has
spanned the whole region. Such cases are dealt with by breaking the region
into smaller parts, but you will not meet this complication in this module.


Figure 12 A region for which Procedure 1 fails. The problem is overcome by
dividing the region into parts, as shown by the dashed line.


Exercise 4


Evaluate the area integral of the function f(x,y) = x - y over the
triangular region S bounded by the lines y = x - 1, x = 3 and y = 0.


The simplest application of an area integral is to find the area of a given 
region S in the xy-plane. The area is given by integrating the constant
function f(x,y) = 1 over the region.


\[ \text{Area of region } S = \int_S 1 \, dA. \tag{11} \]


Exercise 5

Find the area of the region between the curve $y = \cos x$ and the straight
line $y = 1 - 2x/\pi$, for $0 \le x \le \pi/2$.

Note that $\cos x \ge 1 - 2x/\pi$ for $0 \le x \le \pi/2$.


In Procedure 1 we chose to organise the area into vertical strips and 
integrate first with respect to y. We can also organise the area into 
horizontal strips and integrate first with respect to x. In effect, we would 
then continue to use Procedure 1 but with x and y interchanged. In this 
alternative ordering the area integral takes the form 


\[ \int_S f(x, y) \, dA = \int_{y=a}^{y=b} \left( \int_{x=\mu(y)}^{x=\nu(y)} f(x, y) \, dx \right) dy. \tag{12} \]


Here, the inner integration is over x with limits of integration that are 
functions of y, and the outer integration is over y with limits of integration 
that are constants. The inner integration is carried out first; it yields a 
function of y, which is then integrated to produce the final answer (which 
does not depend on x or y). For well-behaved functions, the final answer 
does not depend on whether we divide the area into vertical or horizontal 
strips, so equation (12) gives the same answer as equation (10); the choice 
of method is just one of convenience. 


In the special case of an area integral over a rectangular region aligned 
with the coordinate axes, the limits of integration are all constants. But 
usually, the limits of the inner integral are not constants. This has an 
important consequence for calculations. 


Rethinking the limits of integration 


If we choose to do the x-integral first, as in equation (12), then the
limits of integration must be rethought from scratch. This is done by 
drawing a new sketch of the region, with horizontal strips running 
parallel to the x-axis, rather than vertical strips parallel to the y-axis. 


Let us return to the area integral of f(x,y) = x - y over the triangle S
bounded by the lines y = x — 1, x = 3 and y = 0. This was evaluated in 
Exercise 4 by integrating first over y and then over x. Now we will evaluate 
the same integral, but with the integrals performed in the opposite order. 


A new sketch is needed, and this is given by Figure 13, which shows a
typical horizontal strip across the region of integration. This stretches
from x = y + 1 to x = 3. The minimum and maximum values of y are
y = 0 and y = 2, and these are the lower and upper limits of the
y-integration. So the area integral is expressed as


\[ \int_S f(x, y) \, dA = \int_{y=0}^{y=2} \left( \int_{x=y+1}^{x=3} (x - y) \, dx \right) dy. \]


Figure 13
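This area integral looks different to that in Exercise 4, but it leads to the
same answer. Integrating over x while treating y as a constant gives


\[ \begin{aligned}
\int_S f(x, y) \, dA &= \int_{y=0}^{y=2} \left[ \tfrac{1}{2} x^2 - x y \right]_{x=y+1}^{x=3} dy \\
&= \int_{y=0}^{y=2} \left( \tfrac{9}{2} - 3y - \tfrac{1}{2} (y + 1)^2 + y (y + 1) \right) dy \\
&= \int_{y=0}^{y=2} \left( 4 - 3y + \tfrac{1}{2} y^2 \right) dy.
\end{aligned} \]

The final integral over y then gives

\[ \int_S f(x, y) \, dA = \left[ 4y - \tfrac{3}{2} y^2 + \tfrac{1}{6} y^3 \right]_{y=0}^{y=2} = 8 - 6 + \tfrac{4}{3} = \tfrac{10}{3}, \]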


as before. 
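The agreement between the two orderings can also be confirmed numerically. In the sketch below, the helper nested (an invented name) integrates over the inner variable first, using limits that depend on the outer variable. The vertical-strip limits 1 ≤ x ≤ 3, 0 ≤ y ≤ x − 1 are read off from the triangle with vertices (1, 0), (3, 0) and (3, 2); Simpson's rule is used as a convenient stand-in for exact integration.

```python
def simpson(g, a, b, n=100):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def nested(f, outer_lo, outer_hi, inner_lo, inner_hi, n=100):
    """Integrate f(outer, inner): inner variable first, then outer variable.

    inner_lo and inner_hi are functions of the outer variable;
    outer_lo and outer_hi are constants.
    """

    def g(u):
        return simpson(lambda v: f(u, v), inner_lo(u), inner_hi(u), n)

    return simpson(g, outer_lo, outer_hi, n)


# Vertical strips: inner integral over y from 0 to x - 1, outer over x.
vertical = nested(lambda x, y: x - y, 1.0, 3.0,
                  lambda x: 0.0, lambda x: x - 1.0)

# Horizontal strips: inner integral over x from y + 1 to 3, outer over y.
horizontal = nested(lambda y, x: x - y, 0.0, 2.0,
                    lambda y: y + 1.0, lambda y: 3.0)

print(vertical, horizontal)
```

Note that the limits had to be rewritten from scratch for the second ordering, while the integrand x − y is the same in both calls. Both results should agree with 10/3.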


Exercise 6 


Sketch the areas of integration for each of the following area integrals, and 
write down alternative expressions for the same integrals, but with the 
order of integration reversed. 


@) | 7 ( / fen) i 
0) | 7 ( : _ f(2,v) i) da 


Exercise 7 


The shaded region in the figure below is bounded by the y-axis and the
curve $x = 4 - y^2$.


Find the area of this region by integrating first over x and then over y.




Exercise 8 


Evaluate the area integral of the function f(x,y) = x over the shaded
region S in the figure below, which is a quarter-disc $x^2 + y^2 \le 1$ with $x \ge 0$
and $y \ge 0$.


(Hint: In this case, it is easier to integrate over x first, and then over y.) 


Exercise 9 


Evaluate the area integral of $f(x, y) = \exp(x^2)$ over the triangular region
in the figure below, which is enclosed by the lines x = 1, y = 0 and y = x.


(Hint: The integrals over x and y can be written down in either order, but 
only one of these orderings gives integrals that can be done!) 


2 Volume integrals in Cartesian 
coordinates 


We now consider volume integrals. This section uses Cartesian coordinates 
to evaluate volume integrals over cuboid and some non-cuboid regions. 
The approach is similar to that used for area integrals, but there are now 
three coordinates, x, y and z, to consider, and three integrals to perform in 
succession. 


Figure 14 A cuboid


Figure 15 A narrow column within a cuboid
2.1 Volume integrals over cuboid regions


Suppose that we want to find the total mass of an object from its density. 
Within the object, the density is described by a function f(x,y, z) of 
Cartesian coordinates (x,y,z). This function allows us to find the mass of 
a small volume element. An element of volume $\delta V$, centred on the point
(x,y,z), has mass


\[ \delta M = f(x, y, z) \, \delta V. \]


Strictly speaking, this is an approximation because the density function 
may vary slightly within the volume element, but if the volume element is 
small enough, any such variation is negligible. 


Now imagine subdividing the object into a vast number of tiny 
non-overlapping volume elements. The total mass of the object is the sum 
of the masses of its elements. Rather than adding up an immense number 
of small masses, we take the limit of the sum as the volume of each 
element tends to zero, and the number of elements tends to infinity. In this 
limit, the sum becomes an integral and the approximations become exact. 
The mass of the object can then be written as a volume integral: 


\[ M = \int_R f(x, y, z) \, dV, \]


where R is the region occupied by the object, and f(x, y, z) is the density 
function. 


In this subsection, we explain how volume integrals are calculated in the 
special case where the region of integration is a cuboid (i.e. a rectangular 
block). We choose a Cartesian coordinate system with axes aligned with 
the faces of the block (Figure 14). These faces lie in the coordinate planes 
$x = a_1$, $x = a_2$, $y = b_1$, $y = b_2$, $z = c_1$ and $z = c_2$, where $a_1, a_2, \ldots, c_2$ are
constants. (Here we use subscripts to distinguish the six constants that 
define the cuboid. This is neater than using six different letters of the 
alphabet.) 


Suppose that we want to integrate the density function f(x, y, z) over this 
cuboid. We start with a tiny volume element, which is a tiny block with 
linear dimensions $\delta x$, $\delta y$ and $\delta z$, centred on the point (x, y, z). The volume
of this element is $\delta V = \delta x \, \delta y \, \delta z$, and its mass is


\[ \text{mass of element} = f(x, y, z) \, \delta x \, \delta y \, \delta z. \]


Figure 15 shows how similar tiny elements can be stacked on top of one 
another to produce a vertical column extending from the bottom to the 
top face of the cuboid. All the tiny blocks are centred on the same values 
of x and y, but the value of z varies from $z = c_1$ on the bottom face to
$z = c_2$ on the top face. The mass of the column is the sum of the masses of
the volume elements that it contains, and in the limit of very small volume
elements this can be expressed as an integral:


\[ \text{mass of column} = \left( \int_{z=c_1}^{z=c_2} f(x, y, z) \, dz \right) \delta x \, \delta y, \]




where x and y are held constant during the integration over z. The mass 
of the column remains a function of x and y because the density function 
may vary from point to point. 


Next, we stick many columns together to form a slice running across the 
cuboid at constant x (Figure 16). The mass of the slice is the sum of the 
masses of the columns that it contains, and in the limit of very narrow 
columns this can be expressed as the integral 


\[ \text{mass of slice} = \left( \int_{y=b_1}^{y=b_2} \left( \int_{z=c_1}^{z=c_2} f(x, y, z) \, dz \right) dy \right) \delta x, \]


where x is held constant during the integration over y. The mass of each
slice is a function of x only. 


Finally, we join all the slices together to form the complete cuboid. The 
mass of the cuboid is the sum of the masses of the slices that it contains, 
and in the limit of very thin slices this can be expressed as 


\[ \text{mass of cuboid} = \int_{x=a_1}^{x=a_2} \left( \int_{y=b_1}^{y=b_2} \left( \int_{z=c_1}^{z=c_2} f(x, y, z) \, dz \right) dy \right) dx. \tag{13} \]


This is a typical volume integral over a cuboid. There is no need to go 
through the steps of dividing the cuboid into blocks, columns and slices 
every time. You can go straight to the conclusion given by equation (13). 
More generally, the volume integral of any function f(x,y, z) over a cuboid 
is given by a similar expression. 


Volume integral over a cuboid 


Given a function f(x, y, z) of Cartesian coordinates x, y and z, and a
cuboid region R, with faces lying in the coordinate planes $x = a_1$,
$x = a_2$, $y = b_1$, $y = b_2$, $z = c_1$ and $z = c_2$, the volume integral of f
over R is given by


\[ \int_R f \, dV = \int_{x=a_1}^{x=a_2} \left( \int_{y=b_1}^{y=b_2} \left( \int_{z=c_1}^{z=c_2} f(x, y, z) \, dz \right) dy \right) dx. \tag{14} \]


If f(x,y, z) represents density (the mass per unit volume), then the volume 
integral in equation (14) gives the total mass contained in the region R. If 
f(x,y, z) = 1, then the integral gives the volume of the region R. 


In all cases, this volume integral is evaluated from the inside out. First, we 
integrate over z, holding x and y constant. Then we integrate over y, 
holding x constant. Finally, we integrate over x. None of this should 
surprise you. It follows exactly the same pattern as for area integrals over 
rectangular regions, but we must now integrate over the three coordinates 
x,y and z, rather than just two. 


Figure 16 A thin slice within a cuboid


We assume here that $a_1 < a_2$, $b_1 < b_2$ and $c_1 < c_2$.
As for area integrals over rectangular regions, the order of integration
makes no difference to the final answer. For example, the above integral
can also be written as


\[ \int_R f(x, y, z) \, dV = \int_{z=c_1}^{z=c_2} \left( \int_{y=b_1}^{y=b_2} \left( \int_{x=a_1}^{x=a_2} f(x, y, z) \, dx \right) dy \right) dz. \]


Although the final result is the same, the effort needed for the integration 
may depend on the choice made. 


For area integrals, you saw that product functions integrated over 
rectangular regions can be expressed as the product of two ordinary 
integrals. A similar result applies to volume integrals. The volume integral 
of a product function f(x,y, z) = u(x) v(y) w(z) over a cuboid region is 
simply the product of three ordinary integrals: 


\[ \int_R f(x, y, z) \, dV = \int_{z=c_1}^{z=c_2} w(z) \, dz \times \int_{y=b_1}^{y=b_2} v(y) \, dy \times \int_{x=a_1}^{x=a_2} u(x) \, dx. \]


But you must be very careful: splitting a volume integral into three 
factors works only under very special circumstances. 


• The integrand f(x, y, z) must be a product of the form u(x) v(y) w(z).


• All the limits of integration must be constants: for Cartesian
coordinates (x,y,z), this means that the region of integration must be
a cuboid with its faces aligned with the x-, y- and z-axes.


Other volume integrals cannot be split in this way. 


Example 3 


A cube has faces at x = 0, x = 1, y = 0, y = 1, z = 0 and z = 1, where
lengths are measured in metres. The non-uniform density of the cube (in
kilograms per cubic metre) is described by $f(x, y, z) = x^2 + y^2 + z^2$.
Determine the mass of the cube. 


Solution 


The mass of the cube is given by the volume integral 
\[ M = \int_R (x^2 + y^2 + z^2) \, dV, \]


where R is the volume occupied by the cube. Inserting the appropriate
limits, we have


\[ M = \int_{x=0}^{x=1} \left( \int_{y=0}^{y=1} \left( \int_{z=0}^{z=1} (x^2 + y^2 + z^2) \, dz \right) dy \right) dx. \]


The inner integral is evaluated first. This is an integral over z with x and y 
treated as constants. We get 


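\[ \begin{aligned}
M &= \int_{x=0}^{x=1} \left( \int_{y=0}^{y=1} \left[ x^2 z + y^2 z + \tfrac{1}{3} z^3 \right]_{z=0}^{z=1} dy \right) dx \\
&= \int_{x=0}^{x=1} \left( \int_{y=0}^{y=1} \left( x^2 + y^2 + \tfrac{1}{3} \right) dy \right) dx.
\end{aligned} \]


Next, we integrate over y with x treated as a constant. We get

\[ \begin{aligned}
M &= \int_{x=0}^{x=1} \left[ x^2 y + \tfrac{1}{3} y^3 + \tfrac{1}{3} y \right]_{y=0}^{y=1} dx \\
&= \int_{x=0}^{x=1} \left( x^2 + \tfrac{2}{3} \right) dx.
\end{aligned} \]

Finally, we integrate over x to obtain


\[ M = \left[ \tfrac{1}{3} x^3 + \tfrac{2}{3} x \right]_{x=0}^{x=1} = 1. \]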


So the mass of the cube is 1 kilogram. 
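The inside-out evaluation used in Example 3 can be mirrored in a few lines of code. The sketch below is an illustration only: instead of exact integration it uses a two-point Gauss-Legendre rule (a standard numerical rule that is exact for polynomials up to degree 3), and the names gauss2 and cuboid_integral are invented here.

```python
import math

# Nodes of the two-point Gauss-Legendre rule on [-1, 1]; both weights are 1.
NODES = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))


def gauss2(g, a, b):
    """Two-point Gauss-Legendre estimate of the integral of g over [a, b]."""
    mid, half = (a + b) / 2.0, (b - a) / 2.0
    return half * sum(g(mid + half * t) for t in NODES)


def cuboid_integral(f, x_lim, y_lim, z_lim):
    """Integrate f(x, y, z) over a cuboid: z innermost, then y, then x."""

    def over_z(x, y):
        # Innermost integral: x and y are held constant.
        return gauss2(lambda z: f(x, y, z), *z_lim)

    def over_y(x):
        # Middle integral: x is held constant.
        return gauss2(lambda y: over_z(x, y), *y_lim)

    # Outermost integral over x.
    return gauss2(over_y, *x_lim)


def density(x, y, z):
    """Density of the cube in Example 3, in kg per cubic metre."""
    return x**2 + y**2 + z**2


mass = cuboid_integral(density, (0.0, 1.0), (0.0, 1.0), (0.0, 1.0))
print(mass)
```

Because the density is quadratic in each variable, the rule introduces no approximation error beyond rounding, and the result should match the mass of 1 kilogram found above.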


Exercise 10 


A cuboid block R has faces at x = 0, x = 2, y = 1, y = 2, z = 2 and z = 5,
where lengths are measured in metres. The non-uniform density of the 
block (in kilograms per cubic metre) is given by f(x,y,z) =x+y+4z. 
Find the mass of the block. 


Exercise 11 
Show that the function f(x,y, z) = ryz e (@’+¥°+2") can be expressed as a 
product of the form u(x) v(y) w(z). Hence evaluate the volume integral


P| meer dV, 
R 


where R is a cube with faces at x = 0, x = 1, y = 0, y = 1, z = 0 and z = 1.


2.2 Volume integrals over non-cuboid regions 


A cuboid region of integration is, of course, an exceptional case. For 
shapes other than cuboids, Cartesian coordinates can still be used, but 
care is needed with the limits of integration. 


Remember what happened for area integrals over non-rectangular regions — 
the limits of the inner integral depended on the variable of integration in 
the outer integral. Something very similar happens for volume integrals 
over non-cuboid regions. 


Volume integral over a non-cuboid region 


The volume integral of a function f(x, y, z) over a non-cuboid
region R can be written in the form


\[ \int_R f \, dV = \int_{x=a}^{x=b} \left( \int_{y=\alpha(x)}^{y=\beta(x)} \left( \int_{z=\mu(x,y)}^{z=\nu(x,y)} f(x, y, z) \, dz \right) dy \right) dx. \tag{15} \]
Figure 17 The region of integration for Example 4


Figure 18 Projection of the region of integration onto the xy-plane (the
figure labels the line x + y = 1)
This involves three integrations performed successively, from the innermost
to the outermost. In this case, we integrate first with respect to z (holding
x and y constant). This gives us a function of x and y, which is integrated
with respect to y (holding x constant). This finally gives a function of x,
which is integrated over x. 


The limits for the innermost integral depend on the variables of integration 
x and y in the outer two integrals. The limits for the middle integral 
depend on the variable of integration x in the outermost integral, and the 
limits of the outermost integral are constants. 


In the special case where f(x,y, z) = 1, equation (15) gives the volume of 
the region. 


\[ \text{Volume of region} = \int_{x=a}^{x=b} \left( \int_{y=\alpha(x)}^{y=\beta(x)} \left( \int_{z=\mu(x,y)}^{z=\nu(x,y)} 1 \, dz \right) dy \right) dx. \tag{16} \]


Sometimes the trickiest part of the calculation is finding the limits of 
integration that define the region R. This is best done by sketching two 
diagrams, as illustrated in the following example. 


Example 4 


Find the volume integral of the function $f(x, y, z) = z^2$ over a pyramid R
whose faces are given by x = 0, y = 0, z = 0 and x + y + z = 1. This
pyramid has vertices at points (0,0,0), (1,0,0), (0,1,0) and (0,0,1).


Solution 


The pyramid is sketched in Figure 17. The plane x + y+ z = 1 meets the 
xy-plane in the line x + y = 1, so the triangular base of the pyramid is as
shown in Figure 18. 


The limits of integration can now be determined from our sketches. Above 
any tiny rectangular area element in the base of the pyramid, we can 
imagine a narrow column extending up to the blue shaded plane in 

Figure 17. By summing over all such columns, we cover the whole region 
of integration inside the pyramid. For a single column, rising from a point 
(x,y), we integrate over z from z = 0 to z = 1 - x - y. From Figure 18, we
see that a slice at constant x can be produced by integrating over y from
y = 0 to y = 1 - x. And to include all the slices, we must integrate over x
from x = 0 to x = 1.


The order of integration can be checked from the nature of the limits. The 
z-integral must be placed innermost because its limits depend on both x 
and y. The y-integral comes next because its limits depend on x. Finally, 
the x-integral is outermost because its limits are constants. The function 
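to be integrated is $f(x, y, z) = z^2$, so the required volume integral is


\[ \int_R z^2 \, dV = \int_{x=0}^{x=1} \left( \int_{y=0}^{y=1-x} \left( \int_{z=0}^{z=1-x-y} z^2 \, dz \right) dy \right) dx. \]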




The hardest part of the problem is over; now we just need to do the three 
integrations. We begin by integrating over z, holding both x and y 


constant: 
aaa 1,3]2=l-#-y 1 3 
/ z* dz = [32°|7_5 * a ( Lay)": 
z=0 
We then integrate this function over y, holding x constant. You can see by 
inspection that when k is a constant, This integral can be checked by 
differentiation. Alternatively, 
[e = y)° dy = —i(k = y)! + constant, you could use the change of 
variable u = k — y. 


so, replacing k by 1 — x, we get 


y=1-z = 
/ ; 3(L-2—y)*dy= [-pd 2-9)" *= d(0-2)*. 
y= 


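Finally, we integrate over x to obtain

\[ \int_R f \, dV = \int_{x=0}^{x=1} \tfrac{1}{12} (1-x)^4 \, dx = \left[ -\tfrac{1}{60} (1-x)^5 \right]_{x=0}^{x=1} = \tfrac{1}{60}. \]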
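As a rough numerical cross-check of Example 4, the nested integrals can be approximated by a midpoint rule applied in the same order (z innermost, then y, then x). This is only a sketch: the function name `pyramid_integral` and the grid size `n` are illustrative choices, not part of the text. The estimate should lie close to the exact value 1/60.

```python
def pyramid_integral(n=40):
    """Midpoint-rule estimate of the integral of z**2 over the pyramid
    x >= 0, y >= 0, z >= 0, x + y + z <= 1, nested as in Example 4."""
    total = 0.0
    dx = 1.0 / n
    for i in range(n):                  # outer integral: x from 0 to 1
        x = (i + 0.5) * dx
        dy = (1.0 - x) / n              # middle integral: y from 0 to 1 - x
        for j in range(n):
            y = (j + 0.5) * dy
            dz = (1.0 - x - y) / n      # inner integral: z from 0 to 1 - x - y
            for k in range(n):
                z = (k + 0.5) * dz
                total += z**2 * dz * dy * dx
    return total


estimate = pyramid_integral()
print(estimate, 1 / 60)
```

Notice that the step sizes `dy` and `dz` are recomputed inside the loops, because the inner limits depend on the outer variables.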


The same volume integral can be written in a variety of ways, depending 
on how we order the integrals. For example, we could have projected the 
region onto the yz-plane, and chosen to integrate first over x, then over y, 
and finally over z. The volume integral in Example 4 would then be 
written as 


\[ \int_R f \, dV = \int_{z=0}^{z=1} \left( \int_{y=0}^{y=1-z} \left( \int_{x=0}^{x=1-y-z} z^2 \, dx \right) dy \right) dz. \tag{17} \]


This gives the same answer as the volume integral in Example 4, as you 
can now check. 


Exercise 12 


Verify that equation (17) gives the same answer as that in Example 4. 


Finding the limits of integration is a key step in all problems of this kind, 
and requires great care. Let us review how this is done. It is usually 
helpful to draw two diagrams to visualise the geometry — a perspective 
view of the three-dimensional region of integration and a plan view 
showing the projection of the region onto a coordinate plane (the xy-plane
in Example 4). 


Focusing on a tiny element with coordinates (x,y) in the projection onto 
the xy-plane, and extending upwards in a column within the region, we 
obtain limits for the inner integration over z. In general, these limits 
depend on x and y. Then we imagine sticking many columns together, 
producing a slice across the region at constant x. The y-values at the
extremities of a typical slice are evident in the sketch showing the
projection of the region onto the xy-plane. These are the limits for the
middle integral over y, which in general depend on x. Finally, the
minimum and maximum values of x in the projection onto the xy-plane
give the constant limits for the outer integral over x.


Figure 19 A hemisphere


Figure 20 Projection of the hemisphere onto the xy-plane (the disc
bounded by x² + y² = R²)


When you nest the three integrals to form a triple integral, it is worth 
checking that the following rule is obeyed. 


Rule for the ordering of integrals in a triple integral 


The limits of integration of a given integral can depend only on the 
variables of integration of integrals that lie further outside it (and are 
done after it). The limits of the outer integral are always constants. 


Example 5 


Figure 19 shows a hemisphere of radius R, with its base in the xy-plane, 
centred on the origin. In Cartesian coordinates, points on the curved 
surface of the hemisphere have x² + y² + z² = R². Write down an integral
expression that gives the volume of this hemisphere. Do not spend any 
time evaluating the integrals. 


Solution 


Consider Figure 19 and the given equation for the curved surface of the
hemisphere. For a given element with coordinates (x, y), the limits of the
inner integration are z = 0 and z = √(R² − x² − y²).


We draw a two-dimensional view of the projection of the hemisphere onto
the xy-plane. This is the disc of radius R shown in Figure 20. Choosing to
integrate next over y, we see that for a strip centred on x, the limits are
y = −√(R² − x²) and y = +√(R² − x²). The limits for the final x-integration
are x = −R and x = +R.


To find the volume of the hemisphere, we integrate the function
f(x, y, z) = 1 over the hemispherical region. So the volume is given by

\[ V = \int_{x=-R}^{x=R} \left( \int_{y=-\sqrt{R^2-x^2}}^{y=\sqrt{R^2-x^2}} \left( \int_{z=0}^{z=\sqrt{R^2-x^2-y^2}} 1 \, dz \right) dy \right) dx. \]

This illustrates the process of finding suitable limits, but the integrals are 
lengthy and will not be done here. Later in this unit you will see that there 
are better methods to use in this case, based on non-Cartesian coordinates. 
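Although the text does not evaluate these integrals, a numerical estimate gives a feel for how the Cartesian limits work. The sketch below is an illustration under stated assumptions: the name `hemisphere_volume` and the grid size are invented here, the inner integral of 1 over z is replaced by its value √(R² − x² − y²), and a simple midpoint rule is used for the other two integrals. The result should be close to the familiar value (2/3)πR³.

```python
import math


def hemisphere_volume(R=1.0, n=200):
    """Midpoint-rule estimate of the hemisphere volume using the
    Cartesian limits of Example 5."""
    total = 0.0
    dx = 2.0 * R / n
    for i in range(n):                       # outer integral: x from -R to R
        x = -R + (i + 0.5) * dx
        y_max = math.sqrt(R * R - x * x)     # y runs from -y_max to +y_max
        dy = 2.0 * y_max / n
        for j in range(n):
            y = -y_max + (j + 0.5) * dy
            # The inner integral of 1 dz from 0 to sqrt(R^2 - x^2 - y^2)
            # is just the upper limit itself.
            z_top = math.sqrt(max(R * R - x * x - y * y, 0.0))
            total += z_top * dy * dx
    return total


print(hemisphere_volume(1.0), 2.0 * math.pi / 3.0)
```

The square-root limits make the convergence slower than for a polynomial integrand, which is one practical reason to prefer non-Cartesian coordinates for this shape.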


Exercise 13 


Find the value of the volume integral of the function f(x, y, z) = x²yz over
the wedge-shaped region shown in the margin, which is bounded by the
planes x = 0, y = 0, z = 0, x = 1 and y + z = 1.


Sometimes the mathematical statement of a problem already specifies the 
limits of integration, and we can skip the stage of drawing diagrams, as in 
the following exercise. 




Exercise 14 
A region in three-dimensional space is defined by

0 ≤ x ≤ y + z, 0 ≤ y ≤ z, 0 ≤ z ≤ 1,

where lengths are measured in metres. What is the volume of this region?


3 Using non-Cartesian coordinates 


In principle, it is possible to use the methods of Sections 1 and 2 to 
evaluate any area or volume integral in Cartesian coordinates x, y and z 
— possible, but not always wise! For simple shapes such as discs, cylinders 
and spheres, it is generally much easier to use a different approach based 
on non-Cartesian coordinates. This section gives an introduction to the 
three most commonly used non-Cartesian systems: polar coordinates, 
cylindrical coordinates and spherical coordinates. 


3.1 Area integrals in polar coordinates 


Polar coordinates 


Points in a plane are often specified by Cartesian coordinates (x,y), but 
this is not essential. An alternative choice is to use polar coordinates 
(r, φ), as shown in Figure 21.


Figure 21 Polar coordinates


• The radial coordinate r is the distance of the point from the origin
and lies in the range 0 ≤ r < ∞.


• The angular coordinate φ is the angle measured anticlockwise from
the positive x-direction, measured in radians. The value of φ is not
unique because we can add any integer multiple of 2π radians to it,
and still be describing the same point. We often take φ to lie in the
range 0 ≤ φ < 2π, but other choices (such as −π < φ ≤ π) are equally
valid.


Using trigonometry in Figure 21, we see that Cartesian and polar
coordinates are related as follows.


x = r cos φ, y = r sin φ. (18)


So if we know the polar coordinates (r, φ) of a point, we can easily find its
Cartesian coordinates (x, y).


Sometimes, the polar coordinate is denoted by θ rather than φ.
Our present choice is deliberate and will have advantages when
we compare polar, cylindrical and spherical coordinates.


Figure 22 A Cartesian grid


Figure 23 A polar grid


Recall the formula r δφ for the length of an arc subtended on a
circle of radius r by the angle δφ, where δφ is measured in radians.
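The conversion in equations (18) is easy to express in code. The sketch below also includes the inverse conversion, which is not given in the text: it is an assumed helper that uses the standard library function `math.atan2` and reduces the angle to the range 0 ≤ φ < 2π (up to floating-point rounding). Both function names are illustrative.

```python
import math


def polar_to_cartesian(r, phi):
    """Equations (18): x = r cos(phi), y = r sin(phi)."""
    return r * math.cos(phi), r * math.sin(phi)


def cartesian_to_polar(x, y):
    """Inverse conversion (an addition to the text), with phi reduced
    to the range from 0 up to 2*pi."""
    r = math.hypot(x, y)
    phi = math.atan2(y, x) % (2.0 * math.pi)
    return r, phi


x, y = polar_to_cartesian(2.0, math.pi / 2.0)
r, phi = cartesian_to_polar(-1.0, -1.0)
print(x, y, r, phi)
```

Adding any integer multiple of 2π to `phi` leaves the output of `polar_to_cartesian` unchanged (to rounding), which reflects the non-uniqueness of the angular coordinate.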


Area integrals in polar coordinates 


When an area integral is set up in a given coordinate system, a key step is 
to subdivide the region of integration into a set of tiny area elements. 


In Cartesian coordinates, this is achieved by drawing lines parallel to the 
coordinate axes (Figure 22). Along each horizontal line, x varies at 
constant y. Along each vertical line, y varies at constant x. These lines 
produce a rectangular grid that divides the xy-plane into tiny rectangular 
area elements, such as that shaded in Figure 22. This element has area 


δA = δx δy.


Something similar is done for polar coordinates. As shown in Figure 23, we
create a grid from a set of radial lines (spokes) and a set of circles. Each
spoke is a line along which r increases at constant φ. Each circle is a curve
along which φ varies at constant r. Taken together, the spokes and circles
divide the xy-plane into tiny area elements, such as that shaded in
Figure 23.


Figure 24 shows a tiny area element, with its size exaggerated for clarity.
The element is bounded by radial spokes with φ = φ₀ and φ = φ₀ + δφ,
and circular arcs with r = r₀ and r = r₀ + δr. The spokes and circular arcs
meet at right angles to one another, so if the element is extremely small, it
can be approximated by a rectangle. The sides running along the spokes
have length δr. Because the angle φ is measured in radians, the sides
running round the circular arcs have length r₀ δφ. So the element has area


δA ≈ δr × r₀ δφ = r₀ δr δφ.


Figure 24 An area element in polar coordinates (enlarged for clarity) 


Of course, there is nothing special about the values r = r₀ and φ = φ₀, so
we can drop the subscripts and say that an area element centred on (r, φ)
has area


δA ≈ r δr δφ, (19)


and if we are interested in the area integral of a function f(r, φ) over some
region of the plane, the contribution made by the area element is


f(r, φ) δA ≈ f(r, φ) r δr δφ.


The area integral of f(r, φ) over a given region in the plane is obtained by
adding contributions from all the area elements that make up the region. 
We do this in the limit of an infinite number of infinitesimally small 
elements. Then our approximations become exact, and at the same time 
the sum becomes an integral. 


As a definite case, let us take the region of integration to be a disc of
radius R, centred on the origin. Then the limits for the r-integral are r = 0
and r = R, and the limits for the φ-integral may be taken to be φ = 0 and
φ = 2π. The area integral can therefore be written as follows.


Area integral over a disc in polar coordinates


The area integral of f(r, φ) over a disc S of radius R centred on the
origin is

\[ \int_S f(r,\phi) \, dA = \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=0}^{r=R} f(r,\phi) \, r \, dr \right) d\phi, \tag{20} \]

where we have chosen to integrate first over r, and then over φ.


Take careful note of the factor r that appears in equation (20), and never
make the mistake of leaving it out! It occurs in all area integrals based on
polar coordinates. There are two ways of seeing why this factor must be
included:


• Figure 23 shows that the area elements grow in size as we move away
from the origin. If the area element were simply δr δφ, this fact would
not be respected.


• The expression δr δφ has the dimensions of length (because δr is a
length and δφ is dimensionless). This is not suitable for an area
element; the extra factor r ensures that the area element has the
required dimensions of length squared.
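The effect of the factor r can be seen in a small numerical experiment. The sketch below approximates equation (20) by a midpoint sum over a polar grid; the function name `polar_integral` and the flag `include_r` are hypothetical, introduced only to compare the correct area element r δr δφ with the incorrect δr δφ. For f = 1 on a disc of radius 3, the first sum should be close to the area 9π, while the second is close to 6π, which is not an area at all.

```python
import math


def polar_integral(f, R, n=200, include_r=True):
    """Midpoint-rule estimate of the integral of f(r, phi) over a disc of
    radius R. Setting include_r=False deliberately omits the factor r."""
    dr = R / n
    dphi = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        weight = r if include_r else 1.0   # factor r from the area element
        for j in range(n):
            phi = (j + 0.5) * dphi
            total += f(r, phi) * weight * dr * dphi
    return total


area = polar_integral(lambda r, phi: 1.0, R=3.0)
wrong = polar_integral(lambda r, phi: 1.0, R=3.0, include_r=False)
print(area, wrong)
```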


The significance of the area integral in equation (20) is similar to that of
an area integral in Cartesian coordinates. For example, if f(r, φ) is the
surface density (the mass per unit area) at a point on the disc with polar
coordinates (r, φ), then equation (20) gives the total mass of the disc.


The reverse order is also valid, with the inner integral over φ
and the outer integral over r.


Figure 25 Some regions of integration suitable for polar coordinates


Figure 26 The annular region of integration in Example 6


If f(r, φ) = 1, then equation (20) gives the area of the disc. This evaluates
to

\[ \int_S 1 \, dA = \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=0}^{r=R} r \, dr \right) d\phi = \int_{\phi=0}^{\phi=2\pi} \tfrac{1}{2} R^2 \, d\phi = \pi R^2, \]

as you would expect. The calculation is much easier in polar coordinates
than in Cartesian coordinates. This is because when we integrate over a
disc in polar coordinates, the limits of integration are constants that are
easily determined.


A similar simplification occurs for the regions of integration in Figure 25, 
which also have constant limits of integration in polar coordinates. Polar 
coordinates are usually the preferred choice for such regions. This is true 
even if the integrand is initially specified in Cartesian coordinates because 
it can be easily converted to polar coordinates using equations (18). 


Example 6 


Evaluate the area integral of the function f(x, y) = xy² over the annular
sector S shown in Figure 26. In polar coordinates, S is defined by
1 ≤ r ≤ 2 and 0 ≤ φ ≤ π/2.


Solution 


The shape of the region of integration suggests the use of polar
coordinates. Using equations (18) we have

xy² = (r cos φ) × (r sin φ)² = r³ sin²φ cos φ,

so the required area integral is

\[ \int_S f(x,y) \, dA = \int_{\phi=0}^{\phi=\pi/2} \left( \int_{r=1}^{r=2} r^3 \sin^2\phi \cos\phi \times r \, dr \right) d\phi = \int_{\phi=0}^{\phi=\pi/2} \left( \int_{r=1}^{r=2} r^4 \sin^2\phi \cos\phi \, dr \right) d\phi. \]

Note that the integrand contains a factor r⁴ rather than r³. The extra
factor of r comes from the area element in polar coordinates. Carrying out
the integral over r gives

\[ \int_S f(x,y) \, dA = \int_{\phi=0}^{\phi=\pi/2} \left[ \tfrac{1}{5} r^5 \right]_{r=1}^{r=2} \sin^2\phi \cos\phi \, d\phi = \tfrac{31}{5} \int_{\phi=0}^{\phi=\pi/2} \sin^2\phi \cos\phi \, d\phi. \]
The integral over φ can be done by noting that the integrand is a product
of a function of sin φ times the derivative of sin φ (namely, cos φ). This
suggests that we make the substitution u = sin φ, giving du/dφ = cos φ.
The φ = 0 limit corresponds to u = sin 0 = 0, and the φ = π/2 limit
corresponds to u = sin(π/2) = 1. Putting everything together, we get

\[ \int_S f(x,y) \, dA = \tfrac{31}{5} \int_{\phi=0}^{\phi=\pi/2} u^2 \, \frac{du}{d\phi} \, d\phi = \tfrac{31}{5} \int_{u=0}^{u=1} u^2 \, du = \tfrac{31}{5} \times \tfrac{1}{3} = \tfrac{31}{15}. \]


Note: In this example, the integrand is a product of a function of r and a
function of φ, and the limits of integration are all constants. In cases like
this, it is legitimate to split the integral into the product of two integrals
(just as we did in Cartesian coordinates). We can therefore write

\[ \int_S f(x,y) \, dA = \int_{r=1}^{r=2} r^4 \, dr \times \int_{\phi=0}^{\phi=\pi/2} \sin^2\phi \cos\phi \, d\phi. \]

Evaluation of these integrals gives the same answer as before:
31/5 × 1/3 = 31/15.
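The value 31/15 found in Example 6 can be checked numerically. The following sketch sums the Cartesian integrand xy² over a polar grid covering the annular sector, converting each grid point with equations (18) and including the factor r from the area element. The function name `annular_sector_integral` and its parameters are assumptions made for this illustration.

```python
import math


def annular_sector_integral(f_xy, r1, r2, phi1, phi2, n=200):
    """Midpoint-rule estimate of the area integral of f_xy(x, y) over the
    sector r1 <= r <= r2, phi1 <= phi <= phi2, using polar coordinates."""
    dr = (r2 - r1) / n
    dphi = (phi2 - phi1) / n
    total = 0.0
    for i in range(n):
        r = r1 + (i + 0.5) * dr
        for j in range(n):
            phi = phi1 + (j + 0.5) * dphi
            x, y = r * math.cos(phi), r * math.sin(phi)   # equations (18)
            total += f_xy(x, y) * r * dr * dphi           # area element r dr dphi
    return total


value = annular_sector_integral(lambda x, y: x * y**2, 1.0, 2.0, 0.0, math.pi / 2.0)
print(value, 31.0 / 15.0)
```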


Exercise 15 


A circular laboratory dish of radius R is covered with bacteria. Relative to
an origin at the centre of the dish, the surface number density of bacteria
is given by the function

\[ f(x,y) = \frac{C}{R^4} \left( 2R^2 - x^2 - y^2 \right), \]

where x and y are Cartesian coordinates, and C is a constant. Find the
total number of bacteria on the dish.

(The surface number density is the number per unit area at a given point.
This is modelled as a smoothly-varying function.)


Exercise 16 


Use polar coordinates to evaluate the area integrals ∫_S x dA and ∫_S y dA,
where S is the semicircular area shown below.


[Figure: the semicircular region S; the x-axis is marked at −R, 0 and R.]
Exercise 17 


The function f(r, φ) = e^(−r²) is expressed in polar coordinates. Evaluate the
area integral of this function over the entire xy-plane.


(Hint: An area integral over the entire xy-plane can be carried out in polar
coordinates by letting φ range from 0 to 2π, and r range from 0 to ∞.)


3.2 Volume integrals in cylindrical coordinates


Cylindrical coordinates are a natural extension of polar coordinates to
three dimensions. (The alternative term cylindrical polar coordinates is
also used.) This subsection introduces cylindrical coordinates and
uses them to evaluate volume integrals.


Figure 27 shows how the cylindrical coordinates (r, φ, z) of a point P
are defined.


Figure 27 A cylindrical coordinate system: (a) coordinates (r, φ, z);
(b) relationship to Cartesian coordinates


• The radial coordinate r is the perpendicular distance from the
z-axis to P. It ranges from 0 on the z-axis to infinity. Note carefully:
in cylindrical coordinates, r is not defined as the distance of P from
the origin O.


• The angular coordinate φ is the angle (measured in radians)
between the positive x-axis and the projection of OP onto the
xy-plane. The allowed range of φ corresponds to a complete circuit,
and may be taken to be between 0 and 2π. The sense of increasing φ
is as shown in the diagram (anticlockwise when viewed from a point
on the positive z-axis).


• The axial coordinate z is identical to the z-coordinate of Cartesian
coordinates.


In effect, cylindrical coordinates use the Cartesian z-coordinate parallel to
the z-axis and polar coordinates perpendicular to the z-axis. Using
Figure 27(b), it is easy to see that a point with cylindrical coordinates
(r, φ, z) has the following Cartesian coordinates.


x = r cos φ, y = r sin φ, z = z. (21)


To carry out a volume integral in cylindrical coordinates, we must first
construct a volume element and find an expression for its volume.
A suitable volume element is shown in Figure 28(a). This is obtained by
starting at a point P with cylindrical coordinates (r, φ, z), and making
small positive increments δr, δφ and δz in each of these coordinates. The
edges of the volume element are formed by coordinate lines, that is, lines
or curves along which one cylindrical coordinate increases while the other
two coordinates have constant values. In Figure 28, the green, red and
blue curves correspond to small increases in r, φ and z, respectively.
A wider perspective of these coordinate lines is shown in Figure 28(b).

Figure 28 A volume element in cylindrical coordinates: (a) a close-up
view; (b) a wider perspective


As its dimensions become very small, the volume element approaches a
cuboid in shape. You can see from Figure 28(a) that this cuboid has sides
of length δr, r δφ and δz. (Recall that a circular arc of radius r
subtending an angle δφ, in radians, has length r δφ.) The element
therefore has volume


δV ≈ r δr δφ δz. (22)


Now, suppose that f(r, φ, z) is a function of cylindrical coordinates. Then
the volume integral of f over any given region is approximated by adding
contributions of the form


f(r, φ, z) δV ≈ f(r, φ, z) r δr δφ δz


from all the volume elements that make up the region. We do this in the
limit of an infinite number of infinitesimally small volume elements; then
our approximations become exact, and at the same time the sum becomes
a volume integral. If f(r, φ, z) is the density (the mass per unit volume)
inside a given region, this volume integral gives the total mass contained in
the region.


To take a definite case, suppose that the region of integration is a cylinder
of radius R and height h, aligned on the z-axis and with its base in the
xy-plane (see Figure 29). Then the limits for the r-integral are r = 0 and
r = R, the limits for the φ-integral are φ = 0 and φ = 2π, and the limits
for the z-integral are z = 0 and z = h. The volume integral of f(r, φ, z)
over this cylinder can therefore be written as follows.


Figure 29 A cylindrical region D


Volume integral over a cylinder in cylindrical coordinates


The volume integral of f(r, φ, z) over the cylindrical region D in
Figure 29 is given by

\[ \int_D f \, dV = \int_{z=0}^{z=h} \left( \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=0}^{r=R} f(r,\phi,z) \, r \, dr \right) d\phi \right) dz. \tag{23} \]

The factor r inside the integral must not be forgotten!


The brackets show that we have chosen to integrate first over r, then
over φ, and finally over z, but any other ordering would be equally valid
in this case. For each of the integrations, all variables other than the
variable of integration are held constant. For example, φ and z are held
constant when we integrate over r.


There is nothing unexpected here. The first two integrals, over r and φ,
correspond to integrating over a planar region in polar coordinates, with z
treated as a constant. Only the final integration over z is new.


Example 7

A cylinder of height h and radius R has its central axis along the z-axis,
with its base in the xy-plane as in Figure 29. The density of this cylinder
is given by

\[ f(x,y,z) = \frac{m}{R^5} \left( x^2 + y^2 + z^2 \right), \]

where x, y and z are Cartesian coordinates, and m is a constant. Find the
total mass of the cylinder in terms of m, h and R. Evaluate your answer in
the special case where h = 2R.

(Any multiplicative constants, such as m/R⁵, can be taken outside all the
integral signs.)


Solution 


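The total mass of the cylinder is given by the volume integral

\[ M = \int_{\text{cylinder}} f(x,y,z) \, dV. \]

Because the region of integration is cylindrical in shape, cylindrical
coordinates are the natural choice. We therefore need to express the given
density function in cylindrical coordinates. Using equations (21), we get

\[ f(r,\phi,z) = \frac{m}{R^5} \left( r^2\cos^2\phi + r^2\sin^2\phi + z^2 \right) = \frac{m}{R^5} \left( r^2 + z^2 \right), \]

where, as usual, we have used the same symbol f for the density function
irrespective of the coordinate system.


The total mass is then given by the integral

\[ M = \int_{z=0}^{z=h} \left( \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=0}^{r=R} \frac{m}{R^5} (r^2 + z^2) \, r \, dr \right) d\phi \right) dz = \frac{m}{R^5} \int_{z=0}^{z=h} \left( \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=0}^{r=R} (r^3 + z^2 r) \, dr \right) d\phi \right) dz. \]

Integrating over r and applying the limits of integration gives

\[ M = \frac{m}{R^5} \int_{z=0}^{z=h} \left( \int_{\phi=0}^{\phi=2\pi} \left( \tfrac{1}{4} R^4 + \tfrac{1}{2} R^2 z^2 \right) d\phi \right) dz. \]

The integration over φ is easy. It gives a factor of 2π, which can be taken
outside the integral. So

\[ M = \frac{2\pi m}{R^5} \int_{z=0}^{z=h} \left( \tfrac{1}{4} R^4 + \tfrac{1}{2} R^2 z^2 \right) dz. \]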


Finally, the integral over z gives

\[ M = \frac{2\pi m}{R^5} \left( \tfrac{1}{4} R^4 h + \tfrac{1}{6} R^2 h^3 \right). \]

In the special case where h = 2R, the mass of the cylinder is

\[ M = \frac{2\pi m}{R^5} \left( \tfrac{1}{2} R^5 + \tfrac{4}{3} R^5 \right) = \tfrac{11}{3} \pi m. \]
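The result of Example 7 can be checked by summing the density over a grid of cylindrical volume elements. In this sketch the density is evaluated from the Cartesian form, after converting each grid point with equations (21), and each contribution is weighted by r δr δφ δz. The function name `cylinder_mass` and the grid size are illustrative assumptions. With m = 1, R = 1 and h = 2R, the estimate should be close to (11/3)π.

```python
import math


def cylinder_mass(m, R, h, n=40):
    """Midpoint-rule estimate of the mass of the cylinder in Example 7,
    with density (m / R**5) * (x**2 + y**2 + z**2)."""
    dr, dphi, dz = R / n, 2.0 * math.pi / n, h / n
    total = 0.0
    for k in range(n):                       # outer integral over z
        z = (k + 0.5) * dz
        for j in range(n):                   # middle integral over phi
            phi = (j + 0.5) * dphi
            for i in range(n):               # inner integral over r
                r = (i + 0.5) * dr
                x, y = r * math.cos(phi), r * math.sin(phi)   # equations (21)
                density = (m / R**5) * (x * x + y * y + z * z)
                total += density * r * dr * dphi * dz         # volume element
    return total


print(cylinder_mass(1.0, 1.0, 2.0), 11.0 * math.pi / 3.0)
```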


It is sensible to use cylindrical coordinates for all shapes based on
cylinders, such as hollow cylinders or segments of a cylinder. Of course,
the limits of integration must be adjusted for each particular case.


Exercise 18 


A hollow cylinder, with its central axis of symmetry along the z-axis, has
inner radius 2 and outer radius 5. The two flat ends of the cylinder are at
z = −1 and z = +1. Find the volume integral of the function
f(r, φ, z) = rz² over the volume of this hollow cylinder, where r is the
distance from the z-axis.


Volumes with axial symmetry 


Consider the shape in Figure 30. This shape is unchanged if we rotate it
through any angle around the red axis. Such a shape is said to have axial
symmetry, and the red axis is called the axis of symmetry. If a volume
integral is over a region with axial symmetry, it is generally advisable to
use cylindrical coordinates rather than Cartesian coordinates, with the
z-axis coincident with the axis of symmetry.


We now consider calculating the volumes of objects with axial symmetry.
In general, the limits of integration for these volumes are not all constants,
but axial symmetry is a very useful simplifying feature: it means that the
limits of integration of the r- and z-integrals cannot depend on φ. So the
volume V of any axially-symmetric region can be expressed in cylindrical
coordinates as

\[ V = \int_{\phi=0}^{\phi=2\pi} \left( \int_{z=z_1}^{z=z_2} \left( \int_{r=r_{\min}(z)}^{r=r_{\max}(z)} 1 \times r \, dr \right) dz \right) d\phi. \tag{24} \]

Here, the functions r_min(z) and r_max(z) give the minimum and maximum
values of the radial coordinate r at a given value of z. If the object is
hollow around the z-axis, then r_min(z) is non-zero for at least some values
of z, but a solid object has r_min(z) = 0 for all z. The values z = z₁ and
z = z₂ are the minimum and maximum values of the z-coordinate in the
object.


Figure 30 A shape with axial symmetry relative to the red axis


Recall that in cylindrical coordinates, the radial coordinate is the
distance from the z-axis.


Figure 31 A cone of height h and base radius a


Figure 32 A cross-section through the cone


We can complete the inner integral over r in equation (24) to obtain

\[ V = \int_{\phi=0}^{\phi=2\pi} \left( \int_{z=z_1}^{z=z_2} \tfrac{1}{2} \left( r_{\max}^2(z) - r_{\min}^2(z) \right) dz \right) d\phi. \tag{25} \]

The remaining limits of integration are all constants, so we can reverse the
order of the integrals to get

\[ V = \int_{z=z_1}^{z=z_2} \left( \int_{\phi=0}^{\phi=2\pi} \tfrac{1}{2} \left( r_{\max}^2(z) - r_{\min}^2(z) \right) d\phi \right) dz. \]

Integrating over φ and taking constants outside the remaining integral, we
get the following result.


Volume of an axially symmetric object

\[ V = \pi \int_{z=z_1}^{z=z_2} \left( r_{\max}^2(z) - r_{\min}^2(z) \right) dz. \tag{26} \]

In the special case of a solid object, r_min(z) = 0 for all z, so

\[ V = \pi \int_{z=z_1}^{z=z_2} r_{\max}^2(z) \, dz. \tag{27} \]

This formula can be interpreted by imagining that the object is made up
of many thin discs, each of thickness δz, stacked one on top of the other.
For a given value of z, the appropriate disc has radius r_max(z), area
π r_max²(z) and volume π r_max²(z) δz. Adding together the volumes of all the
discs and taking the limit as δz tends to zero, we recover equation (27).


The following example shows how this result is used. 


Example 8 


Use cylindrical coordinates to find the volume of the cone in Figure 31. 
This cone has height h and base radius a; its axis of symmetry is the 
z-axis, and its base lies in the xy-plane. 


Solution 


Figure 32 shows a cross-section through the central axis of the cone. At a
given value of z, the surface of the cone has radial coordinate r = r_max(z),
as shown in the figure. From similar triangles we see that

\[ \frac{h - z}{r_{\max}(z)} = \frac{h}{a}, \quad \text{so} \quad r_{\max}(z) = a \left( 1 - \frac{z}{h} \right). \]

Using equation (27), the volume of the cone is

\[ V = \pi a^2 \int_{z=0}^{z=h} \left( 1 - \frac{z}{h} \right)^2 dz. \]

This integral can be done in a variety of ways. We choose to make the
substitution u = 1 − z/h, so that z = h − hu. Then dz/du = −h, and we
can make the replacement dz = −h du. The limits z = 0 and z = h
correspond to u = 1 and u = 0, respectively. So the integral becomes

\[ V = \pi a^2 \int_{u=1}^{u=0} u^2 (-h \, du) = -\pi a^2 h \left[ \tfrac{1}{3} u^3 \right]_{u=1}^{u=0} = \tfrac{1}{3} \pi a^2 h. \]
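The stacked-disc picture behind equations (26) and (27) translates directly into a short numerical routine. The sketch below is an illustration, not a prescribed method: the name `axisymmetric_volume`, its arguments and the number of slices are assumptions. Each slice of thickness δz contributes π(r_max² − r_min²) δz. For the cone of Example 8 the estimate should be close to (1/3)πa²h.

```python
import math


def axisymmetric_volume(r_max, z1, z2, r_min=lambda z: 0.0, n=1000):
    """Estimate the volume of an axially symmetric object by summing thin
    discs (or washers, if r_min is non-zero) between z = z1 and z = z2."""
    dz = (z2 - z1) / n
    total = 0.0
    for k in range(n):
        z = z1 + (k + 0.5) * dz
        total += (r_max(z) ** 2 - r_min(z) ** 2) * dz
    return math.pi * total


a, h = 2.0, 3.0
cone = axisymmetric_volume(lambda z: a * (1.0 - z / h), 0.0, h)
print(cone, math.pi * a**2 * h / 3.0)
```

Passing a non-zero `r_min` handles hollow objects in the same way, as in equation (26).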
u=1 




Exercise 19 


A rugby ball has an axis of symmetry along the z-axis, as shown in the
margin. In cylindrical coordinates, its surface can be modelled by the
equation

\[ r_{\max} = a \sqrt{1 - z^2/b^2}, \]

where a and b are positive constants. The smallest and largest values of z
on the surface of the ball are z = −b and z = b. Use equation (27) to find
the volume of the ball.


Exercise 20 


A sphere of radius R, centred on the origin, is sliced across by a horizontal 
plane z = R/2. The part of the sphere above the plane is a spherical cap. 
What is the volume of this spherical cap? 


3.3 Volume integrals in spherical coordinates


Finally, we discuss spherical coordinates, which are used extensively
throughout the physical sciences. (The alternative term spherical polar
coordinates is also used.) Figure 33 shows how the spherical
coordinates (r, θ, φ) of a point P are defined.


Figure 33 A spherical coordinate system: (a) coordinates (r, θ, φ);
(b) relationship to Cartesian coordinates


• The radial coordinate r is the distance of the point from the
origin O. It ranges from 0 at the origin to infinity, and is never
negative. Note that this radial coordinate is not the same as the radial
coordinate in cylindrical coordinates. It is therefore always important
to state which coordinate system is being used.


• The polar angle θ is the smaller of the angles (in radians) between
the positive z-axis and the line OP. It ranges from 0 along the
positive z-axis to π along the negative z-axis.


• The azimuthal angle φ is the angle (in radians) between the positive
x-axis and the projection of OP in the xy-plane. It increases in the
sense shown, and is the same as the angular coordinate φ in cylindrical
coordinates. The allowed range of φ corresponds to a complete circuit,
and may be taken to lie between 0 and 2π.


At first sight, it may seem surprising that θ does not range from 0 to 2π.
To see why this is so, consider the Earth with θ = 0 at the North pole and
θ = π at the South pole. Then it is clear that all latitudes are covered by
letting θ range from 0 to π, and all longitudes are covered by letting φ
range from 0 to 2π. If we gave θ a range larger than 0 ≤ θ ≤ π, we would
be in danger of ‘double-counting’ in volume or surface integrals.


Spherical coordinates (r, θ, φ) can be related to Cartesian coordinates
(x, y, z) using the trigonometry of right-angled triangles in Figure 33(b).


x = r sin θ cos φ, y = r sin θ sin φ, z = r cos θ. (28)


These equations can be used to express any function of x, y and z in terms
of r, θ and φ. For example,

\[ \begin{aligned} z^2 - (x^2 + y^2) &= r^2\cos^2\theta - (r^2\sin^2\theta\cos^2\phi + r^2\sin^2\theta\sin^2\phi) \\ &= r^2\cos^2\theta - r^2\sin^2\theta\,(\cos^2\phi + \sin^2\phi) \\ &= r^2(\cos^2\theta - \sin^2\theta) \\ &= r^2\cos(2\theta). \end{aligned} \]

(Recall that sin²φ + cos²φ = 1 and cos(2θ) = cos²θ − sin²θ.)
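Equations (28) and the worked identity can be spot-checked at an arbitrary point. In this sketch the function name `spherical_to_cartesian` and the sample values of r, θ and φ are arbitrary choices made for illustration; the two quantities printed should agree to rounding error.

```python
import math


def spherical_to_cartesian(r, theta, phi):
    """Equations (28): convert spherical (r, theta, phi) to Cartesian (x, y, z)."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z


r, theta, phi = 2.0, 0.7, 1.9             # arbitrary sample point
x, y, z = spherical_to_cartesian(r, theta, phi)
lhs = z**2 - (x**2 + y**2)                # Cartesian form
rhs = r**2 * math.cos(2.0 * theta)        # spherical form
print(lhs, rhs)
```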


To carry out a volume integral in spherical coordinates, we need to identify
an appropriate volume element and find an expression for its volume. The
required volume element is shown in Figure 34(a). This is obtained by
starting at a point P with spherical coordinates (r, θ, φ), and making small
positive increments δr, δθ and δφ in each of these coordinates.


• The edge PQ corresponds to an increase in r (with θ and φ constant).
• The edge PR corresponds to an increase in θ (with r and φ constant).
• The edge PS corresponds to an increase in φ (with r and θ constant).


A wider perspective is shown in Figure 34(b), where the curves along
which a single spherical coordinate varies are shown in green, blue and red
for r, θ and φ, respectively.


Figure 34 A volume element in spherical coordinates: (a) a close-up
view; (b) a wider perspective


As its dimensions become very small, the volume element approaches a
cuboid in shape, so its volume δV is given by


δV ≈ PQ × PR × PS.


The length of PQ is simply δr. In Figure 34(a), PR is an arc of a blue
circle of radius r. This arc is generated by an angular change δθ in θ, so its
length is r δθ. Finally, PS is an arc of a red circle. Using the trigonometry
in Figure 34(a), you can see that the radius of this circle is r sin θ. The arc
PS is generated by an angular change δφ in φ, so its length is r sin θ δφ.
We therefore have


δV ≈ δr × r δθ × r sin θ δφ = r² sin θ δr δθ δφ. (29)


The volume integral over a given region is then found in the usual way: we
cover the region with tiny volume elements, then take the limit as the
number of elements increases and the volume of each element tends to zero.
A sum over volume elements then becomes a volume integral, with limits of
integration appropriate for the given region. Here is the result for a sphere.


Volume integral over a sphere in spherical coordinates 


The volume integral of f(r, θ, φ) over a spherical region of radius R,
centred on the origin, is given by

\[ \int_{\text{sphere}} f \, dV = \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} \left( \int_{r=0}^{r=R} f(r,\theta,\phi) \, r^2 \sin\theta \, dr \right) d\theta \right) d\phi. \tag{30} \]

Do not forget the factor r² sin θ inside the integral. It appears in
all volume integrals based on spherical coordinates.


This result is easily adapted to other regions by changing the limits of
integration. For example, a hollow spherical shell corresponds to taking
R₁ ≤ r ≤ R₂, and a hemisphere with z ≥ 0 is obtained by taking
0 ≤ θ ≤ π/2. Spherical coordinates are particularly useful when the limits
in all three integrals are constants.


If the function f(r, θ, φ) represents density (the mass per unit volume),
then the volume integral is the total mass in the given region. If
f(r, θ, φ) = 1, then the volume integral is the volume of the region. For
example, the volume of a hollow spherical shell with inner radius R₁ and
outer radius R₂ is given by

\[ V = \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} \left( \int_{r=R_1}^{r=R_2} r^2 \sin\theta \, dr \right) d\theta \right) d\phi. \]

As always, we work from the inside outwards. The first integration is
over r, with θ and φ held constant. This gives

\[ V = \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} \left[ \tfrac{1}{3} r^3 \sin\theta \right]_{r=R_1}^{r=R_2} d\theta \right) d\phi = \frac{R_2^3 - R_1^3}{3} \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} \sin\theta \, d\theta \right) d\phi. \]

This integral can be confirmed by differentiating the result.

The remaining angular integrals are easily done. We get

\[ V = \frac{R_2^3 - R_1^3}{3} \int_{\phi=0}^{\phi=2\pi} \left[ -\cos\theta \right]_{\theta=0}^{\theta=\pi} d\phi = \frac{R_2^3 - R_1^3}{3} \int_{\phi=0}^{\phi=2\pi} 2 \, d\phi = \tfrac{4}{3} \pi \left( R_2^3 - R_1^3 \right). \]


This is the difference between the volume of a solid sphere of radius Ry 
and the volume of a solid sphere of radius R;, just as you would expect. 
The calculation is a good advertisement for using spherical coordinates in 
spherically symmetric situations; going through the lengthy chore of 
calculating the volume of a sphere in Cartesian coordinates is not the 
smart option! 
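The shell volume can also be recovered by summing the volume elements of equation (29) numerically. The following sketch is illustrative only: the function name `spherical_integral`, its arguments and the grid size are assumptions. It applies a midpoint rule with the weight r² sin θ, and for f = 1 between radii 1 and 2 it should give a value close to (4/3)π(2³ − 1³).

```python
import math


def spherical_integral(f, r1, r2, n=40):
    """Midpoint-rule estimate of the volume integral of f(r, theta, phi)
    over the spherical shell r1 <= r <= r2 (use r1 = 0 for a solid sphere)."""
    dr = (r2 - r1) / n
    dtheta = math.pi / n
    dphi = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = r1 + (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            weight = r * r * math.sin(theta)   # factor from the volume element
            for k in range(n):
                phi = (k + 0.5) * dphi
                total += f(r, theta, phi) * weight * dr * dtheta * dphi
    return total


shell = spherical_integral(lambda r, theta, phi: 1.0, 1.0, 2.0)
print(shell, 4.0 * math.pi * (2.0**3 - 1.0**3) / 3.0)
```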


The ordering of the integrals in equation (30) implies that we integrate 
first over r, then over 0, and finally over ¢. However, if the limits of 
integration are all constants — as they are in equation (30) — we can write 
the integrals in a different order and still get the same answer. The choice 
of order can sometimes affect the ease of the integrations. 


Example 9 


In spherical coordinates, a function of position takes the form

\[ f(r,\theta,\phi) = \frac{1}{(a^2 + r^2 - 2ar\cos\theta)^{1/2}}, \]

where a is a positive constant. Integrate this function over the volume of a
sphere of radius R < a, centred on the origin.


(Hint: You may find the following integral useful:

\[ \int \frac{\sin\theta}{(a^2 + b^2 - 2ab\cos\theta)^{1/2}} \, d\theta = \frac{1}{ab} \left( a^2 + b^2 - 2ab\cos\theta \right)^{1/2} + C, \]

where a and b are constants, and C is an arbitrary constant of integration.)

Solution 


In spherical coordinates we write the volume integral in the form

\[
\int_{r=0}^{r=R} \left( \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} \frac{r^2 \sin\theta}{(a^2 + r^2 - 2ar\cos\theta)^{1/2}} \, d\theta \right) d\phi \right) dr,
\]

where the factor $r^2 \sin\theta$ comes from the expression for the volume element.


Here, we have chosen to integrate over $\theta$ first, then over $\phi$, and finally
over $r$. Our motivation for tackling the integrals in this order is that the
task of integrating $f(r,\theta,\phi)$ over $r$ looks tough. We integrate over the
angles first, in the hope that things will get easier!

Using the symbol $I$ to denote the integral over $\theta$, we have

\[
I = \int_{\theta=0}^{\theta=\pi} \frac{r^2 \sin\theta}{(a^2 + r^2 - 2ar\cos\theta)^{1/2}} \, d\theta.
\]


Holding $r$ constant and using the integral given in the question with $b = r$,
we get

\[
I = \left[ \frac{r}{a} (a^2 + r^2 - 2ar\cos\theta)^{1/2} \right]_{\theta=0}^{\theta=\pi}
  = \frac{r}{a} \left( (a^2 + r^2 + 2ar)^{1/2} - (a^2 + r^2 - 2ar)^{1/2} \right)
  = \frac{r}{a} \left( ((a + r)^2)^{1/2} - ((a - r)^2)^{1/2} \right).
\]

Within the sphere, we know that $0 \le r \le R$, and the question tells us that
$R < a$, so we have $0 \le r < a$. Hence the appropriate square roots are

\[
((a + r)^2)^{1/2} = a + r \quad \text{and} \quad ((a - r)^2)^{1/2} = a - r,
\]

giving

\[
I = \frac{r}{a} \big( (a + r) - (a - r) \big) = \frac{2r^2}{a}.
\]


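Our gamble has paid off: the integration over $\theta$ has simplified things
considerably. The volume integral then becomes

\[
\int_{r=0}^{r=R} \left( \int_{\phi=0}^{\phi=2\pi} \frac{2r^2}{a} \, d\phi \right) dr
  = \int_{r=0}^{r=R} \frac{4\pi r^2}{a} \, dr
  = \frac{1}{a} \times \frac{4}{3}\pi R^3.
\]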
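As a rough numerical cross-check of Example 9, the sketch below sums the integrand $r^2 \sin\theta / (a^2 + r^2 - 2ar\cos\theta)^{1/2}$ over a grid of midpoints in $r$ and $\theta$ and multiplies by $2\pi$ for the $\phi$ integration. The values of `a` and `R` and the grid size are illustrative assumptions.

```python
import math

def example9_integral(a, R, n=200):
    """Midpoint-rule estimate of the volume integral of
    1 / sqrt(a^2 + r^2 - 2 a r cos(theta)) over a sphere of radius R < a."""
    dr = R / n
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            f = 1.0 / math.sqrt(a * a + r * r - 2 * a * r * math.cos(theta))
            # r^2 sin(theta) is the factor from the spherical volume element.
            total += f * r * r * math.sin(theta)
    # The integrand is independent of phi, which contributes a factor 2*pi.
    return 2 * math.pi * total * dr * dtheta

a, R = 2.0, 1.0  # illustrative values with R < a (assumptions)
numeric = example9_integral(a, R)
closed_form = 4 * math.pi * R ** 3 / (3 * a)
print(numeric, closed_form)
```

Summing over the angles numerically sidesteps the order-of-integration question, so agreement with $(1/a) \times \tfrac{4}{3}\pi R^3$ is an independent check on the algebra.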


A groundbreaking discovery 


Using some additional arguments, physicists can use the result of
Example 9 to show that the gravitational effect of a uniform sphere of
mass $M$, measured at any point outside the sphere, is the same as
that of a particle of mass $M$ placed at the centre of the sphere.


This discovery was of great historic importance. As early as 1666, 
Isaac Newton took the gravitational effect of a sphere as being 
equivalent to a particle placed at its centre, but initially assumed this 
to be a crude working approximation. Nearly twenty years later, in 
1685, he finally succeeded in proving the fact. (Newton’s calculation 
followed a slightly different route to that given here, but the physical 
conclusions are the same.) We know from his own words that he had 
no expectation of so beautiful a result until it emerged from his 
mathematical investigation. The discovery was groundbreaking; it 
removed the barrier to precise astronomical calculations, and the next 
year Newton felt able to publish his masterpiece, Philosophiae 
Naturalis Principia Mathematica (Figure 35). 


Frequently, the limits of integration are constants and the function to be
integrated is a product of single-variable functions of $r$, $\theta$ and $\phi$:

\[
f(r,\theta,\phi) = u(r)\, v(\theta)\, w(\phi).
\]

Under these circumstances, we can write the volume integral in
equation (30) as the product of three ordinary integrals:

\[
\int f \, dV = \int_{r=0}^{r=R} u(r)\, r^2 \, dr \times \int_{\theta=0}^{\theta=\pi} v(\theta) \sin\theta \, d\theta \times \int_{\phi=0}^{\phi=2\pi} w(\phi) \, d\phi.
\]


Figure 35 The title page of Newton's masterpiece, usually known as the Principia

Example 10


Find the mass of a sphere of radius $R$, centred on the origin, with a density
function given by $f(x,y,z) = mz^2/R^5$, where $m$ is a constant. Convert
from Cartesian coordinates to spherical coordinates before carrying out an
appropriate volume integral.


Solution 


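Using equations (28), the density function in spherical coordinates is

\[
f(r,\theta,\phi) = \frac{m}{R^5}\, r^2 \cos^2\theta,
\]

so the mass of the sphere is

\[
M = \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} \left( \int_{r=0}^{r=R} \frac{m r^2}{R^5} \cos^2\theta \times r^2 \sin\theta \, dr \right) d\theta \right) d\phi.
\]

The integrand is a product function, and the limits of integration are all
constants, so the volume integral can be split into a product of three
ordinary integrals:

\[
M = \frac{m}{R^5} \int_{\phi=0}^{\phi=2\pi} d\phi \times \int_{\theta=0}^{\theta=\pi} \cos^2\theta \sin\theta \, d\theta \times \int_{r=0}^{r=R} r^4 \, dr.
\]

The integral over $\theta$ can be done by making the substitution $u = \cos\theta$.
Then $du/d\theta = -\sin\theta$, and the new lower and upper limits are $u = 1$ and
$u = -1$, respectively. Hence

\[
\int_{\theta=0}^{\theta=\pi} \cos^2\theta \sin\theta \, d\theta
  = \int_{\theta=0}^{\theta=\pi} u^2 \left( -\frac{du}{d\theta} \right) d\theta
  = -\int_{u=1}^{u=-1} u^2 \, du = \frac{2}{3}.
\]

The remaining integrals over $\phi$ and $r$ are easily done. Collecting
everything together, we get

\[
M = \frac{m}{R^5} \times 2\pi \times \frac{2}{3} \times \frac{R^5}{5} = \frac{4}{15}\pi m.
\]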
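The splitting into three ordinary integrals can be mirrored directly in code. This is a minimal sketch, assuming a hypothetical helper `midpoint` and illustrative values for `m` and `R`; it evaluates each single-variable integral separately and multiplies the results.

```python
import math

def midpoint(f, a, b, n=1000):
    """Midpoint-rule estimate of the integral of f from a to b (hypothetical helper)."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

m, R = 3.0, 2.0  # illustrative constants (assumptions)

phi_part = midpoint(lambda phi: 1.0, 0.0, 2 * math.pi)
theta_part = midpoint(lambda t: math.cos(t) ** 2 * math.sin(t), 0.0, math.pi)
r_part = midpoint(lambda r: r ** 4, 0.0, R)

mass = m / R ** 5 * phi_part * theta_part * r_part
print(theta_part, mass, 4 / 15 * math.pi * m)
```

Note that the result is independent of `R`, as the closed form $\tfrac{4}{15}\pi m$ predicts.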


Exercise 21 


The function 
sin 0 


f(r, 8,6) = —— (r 4 0) 


is expressed in spherical coordinates. Find the volume integral of this 
function over the region between two concentric spheres, centred on the 
origin, and of radii r = 1 and r = 2. 


Exercise 22 


(a) Given a function $f(r)$, where $r$ is the distance from the origin, show
that the volume integral of $f(r)$ over a sphere of radius $R$, centred on
the origin, can be expressed as

\[
\int_{\text{sphere}} f \, dV = 4\pi \int_{r=0}^{r=R} f(r)\, r^2 \, dr.
\]

(b) Use your answer to part (a) to find the volume integral of

\[
G(x,y,z) = \sqrt{x^2 + y^2 + z^2}
\]

over a sphere of radius $R$, centred on the origin.


4 A review of coordinate systems 


You have now met several coordinate systems: 

• Cartesian coordinates in two and three dimensions
• polar coordinates in two dimensions
• cylindrical coordinates in three dimensions
• spherical coordinates in three dimensions.


In each case, we defined area or volume elements, and used these elements 
to calculate area and volume integrals. All of these area and volume 
elements can be treated from a unified point of view, using the concept of 
scale factors. The basic concept of a scale factor is explained in 
Subsection 4.1, while Subsection 4.2 gives further details. 


4.1 Orthogonal coordinate systems and scale 
factors 


In any given coordinate system, we can define lines or curves along which 
just one coordinate varies, while the other coordinates remain fixed. Such 
lines or curves are called coordinate lines. 


For example, in polar coordinates $(r,\phi)$, there are $r$-coordinate lines and
$\phi$-coordinate lines (Figure 36). The $r$-coordinate lines are shown in blue:
these are radial lines along which $r$ varies and $\phi$ has a constant value. The
$\phi$-coordinate lines are shown in orange: these are circles around which $\phi$
varies and $r$ has a constant value.


Each coordinate has a corresponding coordinate line. In Cartesian 
coordinates, the x-, y- and z-coordinate lines are all straight lines, parallel 
to the axes. In polar, cylindrical and spherical coordinate systems, at least 
one type of coordinate line is not straight; for this reason they are often 
described as being curvilinear coordinate systems. 


All the coordinate systems discussed so far share an important property. 
In each system, the coordinate lines corresponding to different coordinates 
meet at right angles. For example, in Cartesian coordinates, the x-, y- and 
z-coordinate lines are perpendicular to one another. In polar coordinates, 
the radial $r$-coordinate lines are perpendicular to the circular $\phi$-coordinate
lines, and so on. Coordinate systems with this property are said to be 
orthogonal. 


Figure 36 Coordinate lines in polar coordinates

Figure 37 An area element in polar coordinates $(r, \phi)$

Orthogonal coordinate systems


A coordinate system is said to be orthogonal if its coordinate lines, 
corresponding to different coordinates, meet at right angles. 


Cartesian, polar, cylindrical and spherical coordinate systems are all 
orthogonal. 


Now consider the process of forming an area or volume element in an 
orthogonal coordinate system. First, we review the argument for polar 
coordinates. 


Figure 37 shows a tiny area element in this coordinate system at a point P
with polar coordinates $(r,\phi)$. This element has adjacent sides PQ and PR,
and its size has been exaggerated for clarity. Note that PQ is part of an
$r$-coordinate line, and PR is part of a $\phi$-coordinate line. Because the polar
coordinate system is orthogonal, PQ and PR meet at right angles. This is
a great simplifying feature because it means that the tiny area element can
be approximated by a rectangle. Such an element has area

\[
\delta A = PQ \times PR.
\]

The lengths PQ and PR are easily found from Figure 37. We have
$PQ = \delta r$ and $PR = r\,\delta\phi$, so we conclude that

\[
\delta A = r\,\delta r\,\delta\phi.
\]

Note that this area element is not just the product of the coordinate
increments $\delta r$ and $\delta\phi$. The angular increment $\delta\phi$ is dimensionless, and it
must be multiplied by the factor $r$ to produce the length $PR = r\,\delta\phi$. That
is why the formula for $\delta A$ contains a factor of $r$.


Now consider a more general case. Suppose that we have a
two-dimensional coordinate system with coordinates $(u,v)$. To keep the
argument general, we do not specify the nature of these coordinates; they
could be polar coordinates $(r,\phi)$, or some other choice, but we do insist
that the coordinate system is orthogonal. This means that the $u$- and
$v$-coordinate lines (the curves along which just one coordinate varies) meet
at right angles.

Starting from a given point with coordinates $(u,v)$, we can make a small
increment in $u$, with $v$ held constant. This small increment generates a
small step along the $u$-coordinate line. However, the length of this step
need not be equal to $\delta u$. You saw this in the case of polar coordinates,
where an increment $\delta\phi$ generates a step of length $r\,\delta\phi$. To deal with this
point in a general way, we introduce the concept of a scale factor.


Scale factors

For any coordinate $u$, the length of the segment of the $u$-coordinate
line between $u$ and $u + \delta u$, where $\delta u > 0$, is expressed as

\[
\text{length of segment} = h_u\,\delta u, \tag{31}
\]

where $h_u$ is called the scale factor for the $u$-coordinate; this may be
a function of the coordinates.

For example, the scale factors for polar coordinates $(r,\phi)$ are $h_r = 1$ and
$h_\phi = r$, corresponding to the segment lengths $\delta r$ and $r\,\delta\phi$.


Using scale factors, we can write down a general expression for the area of
an area element in any orthogonal coordinate system. An area element at
a point $(u,v)$ is produced as follows. Starting from the point $(u,v)$ in
Figure 38, we step out along the $u$-coordinate line until $u$ has increased to
$u + \delta u$. We also step out along the $v$-coordinate line until $v$ has increased
to $v + \delta v$. This gives two adjacent sides of the area element. From the
definition of scale factors, we know that these sides have lengths $h_u\,\delta u$ and
$h_v\,\delta v$, respectively. However, the coordinate system is assumed to be
orthogonal. This means that the coordinate lines meet at right angles, and
the area element can be approximated by a rectangle whose area is given
by multiplying the lengths of two adjacent sides. We therefore reach the
following conclusion.

Area element in orthogonal coordinates

In any orthogonal coordinate system $(u,v)$, an area element has area

\[
\delta A = h_u h_v\,\delta u\,\delta v, \tag{32}
\]

where $h_u$ and $h_v$ are appropriate scale factors.


Similar ideas apply to orthogonal coordinate systems in three dimensions.
There are now three coordinates, $(u,v,w)$. Since the coordinate system is
assumed to be orthogonal, the $u$-, $v$- and $w$-coordinate lines meet at right
angles, and a tiny volume element can be approximated by a cuboid. The
volume of this element is given by multiplying the lengths of three adjacent
sides. By the definition of scale factors, these lengths are $h_u\,\delta u$, $h_v\,\delta v$ and
$h_w\,\delta w$. So we have the following result.

Volume element in orthogonal coordinates

In any orthogonal coordinate system $(u,v,w)$, a volume element has
volume

\[
\delta V = h_u h_v h_w\,\delta u\,\delta v\,\delta w, \tag{33}
\]

where $h_u$, $h_v$ and $h_w$ are appropriate scale factors.


Figure 38 An area element in a general orthogonal coordinate system $(u, v)$

The products $h_u h_v$ in two dimensions, and $h_u h_v h_w$ in three dimensions, are
called Jacobian factors. They occur wherever an area or volume integral
uses orthogonal coordinates. Do not make the mistake of leaving them out!


Equations (32) and (33) apply in all orthogonal coordinate systems, but to 
use these equations in a given coordinate system, you need to know the 
scale factors. There is a trivial case: in Cartesian coordinates, all the scale 
factors are equal to 1, and the corresponding area and volume elements are 
$\delta A = \delta x\,\delta y$ and $\delta V = \delta x\,\delta y\,\delta z$. For other coordinate systems, we can use
the results for area and volume elements derived in previous sections to 
compile a list of all the scale factors that we need. 


Scale factors in some orthogonal coordinate systems

Polar coordinates $(r, \phi)$:             $h_r = 1$, $h_\phi = r$                              (34)
Cylindrical coordinates $(r, \phi, z)$:    $h_r = 1$, $h_\phi = r$, $h_z = 1$                   (35)
Spherical coordinates $(r, \theta, \phi)$: $h_r = 1$, $h_\theta = r$, $h_\phi = r\sin\theta$    (36)


The scale factors for polar coordinates correspond to the area element in 
Figure 37. For reference purposes, the volume elements in cylindrical and 
spherical coordinates are reproduced in Figure 39, and you can see that 
these correspond to the scale factors in the list above. 


Figure 39 Volume elements in (a) cylindrical coordinates and
(b) spherical coordinates


Exercise 23 


Use the scale factors in equations (35) and (36) to write down formulas for 
volume elements in cylindrical and spherical coordinates. 


Why should we bother with scale factors? The main reason is conceptual:
they provide a unified language for discussing area and volume integrals.
Apart from this, the concepts of orthogonal coordinate systems, coordinate
lines and scale factors are all used later in this book. They reappear in
contexts other than integration, but the introduction given here provides a
good foundation.


4.2 Another way of calculating scale factors 


This subsection derives a neat formula (equation (39)) that can be
used to find scale factors without drawing diagrams or using
trigonometry. The formula is used later, but its derivation will not be
assessed.


Suppose that we have a coordinate system $(u,v,w)$, and we know the
relationship between Cartesian coordinates $(x,y,z)$ and $(u,v,w)$. For
example, in spherical coordinates $(r,\theta,\phi)$, we know that

\[
x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi \quad \text{and} \quad z = r\cos\theta.
\]

Then we can use this information directly to find the scale factors.


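Figure 40 shows the effect of stepping out along the $u$-coordinate line by
making a small increment $\delta u > 0$ in the coordinate $u$. We move from a
point P with coordinates $(u,v,w)$, to a point Q with coordinates
$(u + \delta u, v, w)$. The displacement vector between P and Q is denoted by $\mathbf{a}$.
The magnitude of this vector is the distance between P and Q, which, by
definition, is equal to $h_u\,\delta u$. So we have

\[
|\mathbf{a}| = h_u\,\delta u. \tag{37}
\]

The displacement vector $\mathbf{a}$ can be written in Cartesian coordinates as

\[
\mathbf{a} = \delta x\,\mathbf{i} + \delta y\,\mathbf{j} + \delta z\,\mathbf{k},
\]

where $\delta x$, $\delta y$ and $\delta z$ are the changes in Cartesian coordinates between P
and Q. However, the chain rule tells us how a small change in $x$ is related
to small changes in $u$, $v$ and $w$:

\[
\delta x = \frac{\partial x}{\partial u}\,\delta u + \frac{\partial x}{\partial v}\,\delta v + \frac{\partial x}{\partial w}\,\delta w = \frac{\partial x}{\partial u}\,\delta u,
\]

where the last step follows because $\delta v = 0$ and $\delta w = 0$ along the
$u$-coordinate line. Of course, there are similar expressions for $\delta y$ and $\delta z$, so
we conclude that

\[
\mathbf{a} = \left( \frac{\partial x}{\partial u}\,\mathbf{i} + \frac{\partial y}{\partial u}\,\mathbf{j} + \frac{\partial z}{\partial u}\,\mathbf{k} \right) \delta u. \tag{38}
\]

Since $\delta u > 0$, the magnitude of $\mathbf{a}$ is given by

\[
|\mathbf{a}| = \sqrt{ \left( \frac{\partial x}{\partial u} \right)^2 + \left( \frac{\partial y}{\partial u} \right)^2 + \left( \frac{\partial z}{\partial u} \right)^2 } \; \delta u.
\]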


Comparing this with equation (37), we get the following general formula
for the scale factor $h_u$.

Figure 40 A small displacement along the $u$-coordinate line

This chain rule was introduced in equation (19) of Unit 7.

Formula for a scale factor

\[
h_u = \sqrt{ \left( \frac{\partial x}{\partial u} \right)^2 + \left( \frac{\partial y}{\partial u} \right)^2 + \left( \frac{\partial z}{\partial u} \right)^2 }. \tag{39}
\]


Similar formulas apply for $h_v$ and $h_w$, but with partial derivatives taken
with respect to $v$ and $w$, respectively. For a two-dimensional coordinate
system $(u,v)$ in the $xy$-plane, we use essentially the same formula, but
with the final term left out.
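Equation (39) also lends itself to a numerical check. The sketch below is an illustration under stated assumptions: the function names `scale_factor` and `cylindrical` are hypothetical, the partial derivatives are approximated by central differences rather than computed exactly, and the sample point is arbitrary. It is applied to cylindrical coordinates, where equation (35) gives $h_r = 1$, $h_\phi = r$ and $h_z = 1$.

```python
import math

def scale_factor(to_cartesian, coords, index, step=1e-6):
    """Estimate the scale factor for coords[index] from equation (39),
    replacing each partial derivative by a central difference."""
    plus, minus = list(coords), list(coords)
    plus[index] += step
    minus[index] -= step
    p, q = to_cartesian(*plus), to_cartesian(*minus)
    # Square root of the sum of squared derivatives of x, y and z.
    return math.sqrt(sum(((pc - qc) / (2 * step)) ** 2 for pc, qc in zip(p, q)))

def cylindrical(r, phi, z):
    """Cartesian coordinates (x, y, z) of the point with cylindrical coordinates (r, phi, z)."""
    return (r * math.cos(phi), r * math.sin(phi), z)

point = (2.0, 0.7, -1.0)  # arbitrary sample point with r = 2 (an assumption)
h_r, h_phi, h_z = (scale_factor(cylindrical, point, k) for k in range(3))
print(h_r, h_phi, h_z)
```

Passing a different `to_cartesian` function gives the scale factors of another coordinate system at the chosen point, which is a convenient way to check a hand calculation.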


Example 11 


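Use equation (39) to find the scale factors for polar coordinates $(r, \phi)$,
which are related to Cartesian coordinates by

\[
x = r\cos\phi, \quad y = r\sin\phi.
\]

Solution

Taking partial derivatives of $x$ and $y$ with respect to $r$ and $\phi$, we have

\[
\frac{\partial x}{\partial r} = \cos\phi, \quad \frac{\partial y}{\partial r} = \sin\phi, \quad
\frac{\partial x}{\partial \phi} = -r\sin\phi, \quad \frac{\partial y}{\partial \phi} = r\cos\phi.
\]

Hence the scale factors are

\[
h_r = \sqrt{\cos^2\phi + \sin^2\phi} = 1,
\]
\[
h_\phi = \sqrt{(-r\sin\phi)^2 + (r\cos\phi)^2} = r.
\]

These values agree with those in equation (34).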

Exercise 24 


Use equation (39) to find the scale factors for spherical coordinates
$(r,\theta,\phi)$, which are related to Cartesian coordinates by

\[
x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta.
\]


5 Surface integrals 


This section considers integrals over surfaces that are not flat. For 
example, we might know the surface density of paint at each point on the 
surface of a car. This surface density tells us the mass of paint per unit 
area in the vicinity of any given point on the surface of the car. If we 
integrate the surface density over the curved surface of the car, we will get 
the total mass of paint on the car. But how do we integrate over a curved 
surface?

5.1 The surface area of a sphere


As a simple example, consider the surface area of a sphere of radius $R$.
You may know that this surface area is $4\pi R^2$, but it is worth seeing where
this comes from.

We choose our origin to be at the centre of the sphere, and set up spherical
coordinates $(r,\theta,\phi)$. We need only two coordinates $(\theta, \phi)$ to specify any
point on the surface of the sphere (because all such points have $r = R$).
We sometimes say that the surface of the sphere is parametrised by the
coordinates $\theta$ and $\phi$.

We can draw $\theta$- and $\phi$-coordinate lines on the surface of the sphere, and
these subdivide the surface into a large number of surface elements
(Figure 41). With a very fine subdivision, each tiny element can be
approximated by a rectangle with sides of length $h_\theta\,\delta\theta$ and $h_\phi\,\delta\phi$, where
$h_\theta = R$ and $h_\phi = R\sin\theta$ are the scale factors for spherical coordinates.
Hence the area of an element centred on coordinates $(\theta, \phi)$ is

\[
\delta A = h_\theta h_\phi\,\delta\theta\,\delta\phi = R^2 \sin\theta\,\delta\theta\,\delta\phi. \tag{40}
\]

To find the total surface area of the sphere, we must add the areas of all
the surface elements. We do this in the limit of vanishingly small elements,
so that the summation is achieved by integration. The values of $\theta$ and $\phi$
cover the ranges $0 \le \theta \le \pi$ and $0 \le \phi \le 2\pi$, so the total surface area of the
sphere is

\[
\text{surface area} = \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} R^2 \sin\theta \, d\theta \right) d\phi = 4\pi R^2,
\]

as expected.


This surface area is the integral over the spherical surface of the function
$f = 1$. We can also integrate other functions over this surface. Suppose
that the sphere is unevenly coated with a layer whose surface density (i.e.
mass per unit area) is

\[
f(\theta,\phi) = A\cos^2\theta,
\]

where $A$ is a positive constant. Then the total mass of the layer is

\[
M = \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} f(\theta,\phi)\, R^2 \sin\theta \, d\theta \right) d\phi
  = A R^2 \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} \cos^2\theta \sin\theta \, d\theta \right) d\phi.
\]

The angular integrals over $\theta$ and $\phi$ were calculated in Example 10, and
give factors of $\tfrac{2}{3}$ and $2\pi$, respectively, so we get

\[
M = A R^2 \times \tfrac{2}{3} \times 2\pi = \tfrac{4}{3}\pi A R^2.
\]


Figure 41 The surface of a sphere is divided into surface elements by a grid
formed by $\theta$-coordinate lines (blue) and $\phi$-coordinate lines (red)

Remember that $R$ is constant for a given spherical surface.

In practice, scientists and engineers do not spend much time evaluating
surface integrals. They either consider simple surfaces, such as spheres or
cylinders, or use numerical methods.

Figure 42 A surface grid formed by intermeshing coordinate lines

Exercise 25


A sphere of radius R is centred on the origin. Find the surface area of the 
spherical cap formed by the portion of the sphere that has z > R/2. 


5.2 A general method for surface integrals 


The calculation of surface integrals on the surface of a sphere works well
because points on the surface of a sphere can be labelled by the angular
coordinates $\theta$ and $\phi$ of spherical coordinates. This coordinate system is
orthogonal, so it generates area elements on the surface of the sphere that
can be approximated by tiny rectangles. We know the relevant scale
factors $h_\theta$ and $h_\phi$ in this case, so it is fairly easy to obtain equation (40)
for an area element.


On a general surface, we have to work harder to get suitable 
expressions for the area elements. This final subsection develops a 
general method for doing this, summarised by equations (43) and (44) 
below. You may be asked to apply these results, but the arguments 
leading up to them will not be assessed. 


Let us assume that the surface under investigation is parametrised by
coordinates $(u,v)$. This means that each allowed pair of values $(u,v)$ labels
a unique point on the surface. As $u$ and $v$ vary over their allowed ranges,
the entire surface is mapped out. You have seen how this works for the
surface of a sphere, which is parametrised by the coordinates $\theta$ and $\phi$, with
$0 \le \theta \le \pi$ and $0 \le \phi \le 2\pi$.

Each point on the surface can also be represented by Cartesian coordinates
$(x,y,z)$. So there must be some link between Cartesian coordinates and $u$
and $v$. On the surface, $x$, $y$ and $z$ will be given by specific functions

\[
x = x(u,v), \quad y = y(u,v), \quad z = z(u,v). \tag{41}
\]

For example, in the case of a sphere of radius $R$, centred on the origin, and
parametrised by $\theta = u$ and $\phi = v$, these equations take the form

\[
x = R\sin\theta\cos\phi, \quad y = R\sin\theta\sin\phi, \quad z = R\cos\theta, \tag{42}
\]

where $R$ is a constant for a given sphere.


On an arbitrary surface, the u-coordinate lines intermesh with the 
v-coordinate lines to produce a grid of tiny surface elements (Figure 42). 


In general, the $u$- and $v$-coordinate lines do not meet at right angles, and
the surface elements are approximated by tiny flat parallelograms, rather
than rectangles. We need to find the areas of these parallelograms, one of
which is shown greatly enlarged in Figure 43.

Figure 43 An area element can be approximated by a tiny parallelogram,
shown greatly enlarged here

Using equation (34) from Unit 4, we have

\[
\text{area of parallelogram} = |\mathbf{a} \times \mathbf{b}|,
\]

where $\mathbf{a}$ and $\mathbf{b}$ are the displacement vectors shown in Figure 43. We also
know from Unit 4 that the vector product $\mathbf{a} \times \mathbf{b}$ can be expressed as a
determinant:

\[
\mathbf{a} \times \mathbf{b} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
a_x & a_y & a_z \\
b_x & b_y & b_z
\end{vmatrix}.
\]

However, we obtained an expression for the vector $\mathbf{a}$ in Subsection 4.2.
Equation (38) tells us that

\[
\mathbf{a} = (a_x, a_y, a_z) = \left( \frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}, \frac{\partial z}{\partial u} \right) \delta u,
\]

and by a similar argument,

\[
\mathbf{b} = (b_x, b_y, b_z) = \left( \frac{\partial x}{\partial v}, \frac{\partial y}{\partial v}, \frac{\partial z}{\partial v} \right) \delta v.
\]


Combining these results with the preceding equations, we obtain the 
following general result for the area of a surface element on a curved 
surface. 


See Unit 4, equation (61).

Note that the vertical lines in equation (43) indicate the magnitude of a
vector, while the vertical lines in equation (44) indicate a determinant.

You will need to use the identity $\cos^2 x + \sin^2 x = 1$.

Equation (46) assumes that all the limits of integration are constants. This
covers all the cases considered in this module.


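Area of a surface element on a curved surface

If a surface is parametrised by coordinates $(u,v)$, where $x = x(u,v)$,
$y = y(u,v)$ and $z = z(u,v)$, then the area of a surface element is given
by

\[
\delta A = |\mathbf{J}|\,\delta u\,\delta v, \tag{43}
\]

where $|\mathbf{J}|$ is the magnitude of the vector

\[
\mathbf{J} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt]
\dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u} \\[6pt]
\dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v}
\end{vmatrix}. \tag{44}
\]

The vector $\mathbf{J}$ is sometimes called the Jacobian vector.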


The magnitude of $\mathbf{J}$ determines the area of a surface element. The
direction of $\mathbf{J}$ also has a simple interpretation. Because the vector product
$\mathbf{a} \times \mathbf{b}$ is perpendicular to both $\mathbf{a}$ and $\mathbf{b}$, and because $\mathbf{a}$ and $\mathbf{b}$ in Figure 43
lie in the plane of the surface, the vector $\mathbf{J}$ is perpendicular to the surface.


Before using equations (43) and (44) more generally, let us just check that
they give the result obtained earlier for the surface element of a sphere,
parametrised by $\theta$ and $\phi$ of spherical coordinates. In this case the relevant
partial derivatives were calculated in Exercise 24. Setting $r = R$, we get

\[
\mathbf{J} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
R\cos\theta\cos\phi & R\cos\theta\sin\phi & -R\sin\theta \\
-R\sin\theta\sin\phi & R\sin\theta\cos\phi & 0
\end{vmatrix}. \tag{45}
\]

Exercise 26

Expand the determinant in equation (45), and hence confirm that the area
of a surface element on a sphere is given by $\delta A = R^2 \sin\theta\,\delta\theta\,\delta\phi$.


Using the surface element $\delta A = |\mathbf{J}|\,\delta u\,\delta v$, it is easy to write down a general
expression for a surface integral.

Surface integral over a curved surface

For a surface $S$ parametrised by coordinates $(u,v)$, the surface
integral of a function $f(u,v)$ over $S$ is given by

\[
\int_S f \, dA = \int_{v=v_1}^{v=v_2} \left( \int_{u=u_1}^{u=u_2} f(u,v)\, |\mathbf{J}| \, du \right) dv, \tag{46}
\]

where the ranges $u_1 \le u \le u_2$ and $v_1 \le v \le v_2$ are chosen to cover the
surface $S$ exactly.
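The recipe in equations (43), (44) and (46) can be sketched numerically. The code below is an illustration only: the helper names (`cross`, `jacobian_magnitude`, `surface_integral`) are hypothetical, the partial derivatives in $\mathbf{J}$ are approximated by central differences, and the integral is approximated by a midpoint sum with constant limits. As a test case it uses the coated sphere of Subsection 5.1, with illustrative values of `R` and `A`.

```python
import math

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def jacobian_magnitude(surface, u, v, step=1e-6):
    """|J| at (u, v), with the partial derivatives in equation (44)
    approximated by central differences."""
    pu, mu = surface(u + step, v), surface(u - step, v)
    pv, mv = surface(u, v + step), surface(u, v - step)
    d_u = tuple((p - m) / (2 * step) for p, m in zip(pu, mu))
    d_v = tuple((p - m) / (2 * step) for p, m in zip(pv, mv))
    return math.sqrt(sum(c * c for c in cross(d_u, d_v)))

def surface_integral(f, surface, u_range, v_range, n=100):
    """Midpoint-sum approximation to equation (46) with constant limits."""
    (u1, u2), (v1, v2) = u_range, v_range
    hu, hv = (u2 - u1) / n, (v2 - v1) / n
    total = 0.0
    for i in range(n):
        u = u1 + (i + 0.5) * hu
        for j in range(n):
            v = v1 + (j + 0.5) * hv
            total += f(u, v) * jacobian_magnitude(surface, u, v)
    return total * hu * hv

R, A = 1.5, 2.0  # illustrative radius and density constant (assumptions)

def sphere(theta, phi):
    """Equations (42): points on a sphere of radius R."""
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))

mass = surface_integral(lambda theta, phi: A * math.cos(theta) ** 2,
                        sphere, (0.0, math.pi), (0.0, 2 * math.pi))
expected = 4 / 3 * math.pi * A * R ** 2
print(mass, expected)
```

Because only the parametrisation is supplied, the same functions apply to surfaces whose coordinate lines do not meet at right angles.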




To find the area of a surface S, we use equation (46) with f = 1. 


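Example 12

A cone has height $h$ and base radius $a$ (see Figure 44). The sloping surface
of this cone can be parametrised by two of the cylindrical coordinates, $r$
and $\phi$, with $0 \le r \le a$ and $0 \le \phi \le 2\pi$. In terms of these parameters,
points on the sloping surface of the cone have Cartesian coordinates

\[
x = r\cos\phi, \quad y = r\sin\phi, \quad z = h\left( 1 - \frac{r}{a} \right).
\]

Find the area of the sloping surface of the cone (not including its base).

Figure 44 Cross-section through a cone

Solution

Taking partial derivatives of $x$, $y$ and $z$ with respect to our chosen
parameters $r$ and $\phi$ gives

\[
\frac{\partial x}{\partial r} = \cos\phi, \quad \frac{\partial y}{\partial r} = \sin\phi, \quad \frac{\partial z}{\partial r} = -\frac{h}{a},
\]
\[
\frac{\partial x}{\partial \phi} = -r\sin\phi, \quad \frac{\partial y}{\partial \phi} = r\cos\phi, \quad \frac{\partial z}{\partial \phi} = 0.
\]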
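Using these results in the expression for the Jacobian vector $\mathbf{J}$, we get

\[
\mathbf{J} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\cos\phi & \sin\phi & -h/a \\
-r\sin\phi & r\cos\phi & 0
\end{vmatrix}
= \frac{h}{a}\, r\cos\phi\,\mathbf{i} + \frac{h}{a}\, r\sin\phi\,\mathbf{j} + r(\cos^2\phi + \sin^2\phi)\,\mathbf{k}
= \frac{h}{a}\, r\cos\phi\,\mathbf{i} + \frac{h}{a}\, r\sin\phi\,\mathbf{j} + r\,\mathbf{k}.
\]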
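The square of the magnitude of $\mathbf{J}$ is

\[
|\mathbf{J}|^2 = \frac{h^2}{a^2}\, r^2 (\cos^2\phi + \sin^2\phi) + r^2 = \left( 1 + \frac{h^2}{a^2} \right) r^2,
\]

so the area element is

\[
\delta A = \left( 1 + \frac{h^2}{a^2} \right)^{1/2} r\,\delta r\,\delta\phi.
\]

Using this area element and integrating over the ranges of $r$ and $\phi$, we get

\[
\text{area} = \left( 1 + \frac{h^2}{a^2} \right)^{1/2} \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=0}^{r=a} r \, dr \right) d\phi
= \pi a \sqrt{a^2 + h^2}.
\]

A useful check is provided by letting $h$ tend to zero. The surface area then
tends to $\pi a^2$, which is the area of a circle, as expected.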
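The cone calculation can be reproduced step by step in a short script. This is a sketch under stated assumptions: the function name `cone_area`, the grid size and the sample dimensions are illustrative. It builds the two rows of partial derivatives from the example, forms the vector product by hand, and sums $|\mathbf{J}|\,\delta r\,\delta\phi$ over a grid of midpoints.

```python
import math

def cone_area(a, h, n=100):
    """Sum |J| dr dphi over the sloping surface of a cone of base radius a and height h."""
    dr, dphi = a / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            phi = (j + 0.5) * dphi
            # Partial derivatives of (x, y, z) with respect to r and phi.
            d_r = (math.cos(phi), math.sin(phi), -h / a)
            d_phi = (-r * math.sin(phi), r * math.cos(phi), 0.0)
            # Components of the Jacobian vector J = d_r x d_phi.
            jx = d_r[1] * d_phi[2] - d_r[2] * d_phi[1]
            jy = d_r[2] * d_phi[0] - d_r[0] * d_phi[2]
            jz = d_r[0] * d_phi[1] - d_r[1] * d_phi[0]
            total += math.sqrt(jx * jx + jy * jy + jz * jz)
    return total * dr * dphi

a, h = 3.0, 4.0  # illustrative dimensions (assumptions)
numeric = cone_area(a, h)
exact = math.pi * a * math.sqrt(a * a + h * h)
print(numeric, exact)
```

Setting `h` to zero in `cone_area` reproduces the flat-disc check described above.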




Exercise 27 


The parabolic reflector dish shown in cross-section in the margin has
radius $R$ at its opening and axial depth $R/2$. Its surface can be
parametrised by two of the cylindrical coordinates, $r$ and $\phi$, with
$0 \le r \le R$ and $0 \le \phi \le 2\pi$. In terms of these parameters, points on the
surface of the dish have Cartesian coordinates

\[
x = r\cos\phi, \quad y = r\sin\phi, \quad z = \frac{r^2}{2R}.
\]

Find the surface area of the outside of this dish.


Exercise 28 


A certain surface $S$ has parameters $u$ and $v$, and extends over $0 \le u \le 1$
and $0 \le v \le 1$. In terms of these parameters, the Cartesian coordinates of
points on $S$ are

\[
x = \tfrac{1}{2}(u^2 + v^2), \quad y = \tfrac{1}{2}(u^2 - v^2), \quad z = uv.
\]


Find the surface area of S. 


Learning outcomes 


After studying this unit, you should be able to do the following. 


• Evaluate area integrals over rectangular and non-rectangular regions
of the xy-plane using Cartesian coordinates.

• Evaluate volume integrals over cuboid and non-cuboid regions using
Cartesian coordinates.

• Evaluate area and volume integrals using polar, cylindrical and
spherical coordinates.

• Define the terms orthogonal coordinate system, coordinate line, scale
factor, Jacobian factor and Jacobian vector.

• Evaluate surface integrals over curved surfaces.

Solutions to exercises


Solution to Exercise 1 


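The integrand is a product of a function of $x$ and a function of $y$, and the
range of integration is a rectangle aligned with the coordinate axes, so we
can evaluate the integral as the product of two definite integrals:

\[
\int_S x^2 y^3 \, dA = \left( \int_{x=0}^{x=2} x^2 \, dx \right) \times \left( \int_{y=1}^{y=3} y^3 \, dy \right)
= \left[ \tfrac{1}{3} x^3 \right]_{x=0}^{x=2} \times \left[ \tfrac{1}{4} y^4 \right]_{y=1}^{y=3}
= \frac{8}{3} \times \frac{81 - 1}{4} = \frac{160}{3}.
\]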
Solution to Exercise 2 


This integrand does not factorise into a function of $x$ times a function
of $y$, so we must evaluate it by two successive integrations. Because the
limits for $y$ involve a zero, it is slightly easier to integrate over $y$ first.
Remembering to treat $x$ as a constant when we perform the $y$-integration,
we get

\[
\int_S (1 + x + y) \, dA = \int_{x=1}^{x=4} \left( \int_{y=0}^{y=3} (1 + x + y) \, dy \right) dx
= \int_{x=1}^{x=4} \left[ y + xy + \tfrac{1}{2} y^2 \right]_{y=0}^{y=3} dx
= \int_{x=1}^{x=4} \left( \tfrac{15}{2} + 3x \right) dx.
\]

Carrying out the remaining integration over $x$, we conclude that

\[
\int_S (1 + x + y) \, dA = \left[ \tfrac{15}{2} x + \tfrac{3}{2} x^2 \right]_{x=1}^{x=4}
= 30 + 24 - \tfrac{15}{2} - \tfrac{3}{2} = 45.
\]
Solution to Exercise 3 


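The required area integral can be written as

\[
I = \int_{y=0}^{y=\pi} \left( \int_{x=0}^{x=\pi} \cos(x + y) \, dx \right) dy.
\]

Integrating with respect to $x$, with $y$ held constant, we obtain

\[
I = \int_{y=0}^{y=\pi} \Big[ \sin(x + y) \Big]_{x=0}^{x=\pi} dy
= \int_{y=0}^{y=\pi} \big( \sin(\pi + y) - \sin(y) \big) \, dy.
\]

The remaining integral over $y$ gives

\[
I = \Big[ -\cos(\pi + y) + \cos(y) \Big]_{y=0}^{y=\pi}
= \big( -\cos(2\pi) + \cos(\pi) \big) - \big( -\cos(\pi) + \cos(0) \big) = -4.
\]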


[Margin figure for Solution to Exercise 5: the curves $y = \cos x$ and $y = 1 - 2x/\pi$ between $x = 0$ and $x = \pi/2$]

Solution to Exercise 4


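The region of integration is shown in the diagram in the margin, with a
typical strip parallel to the $y$-axis marked. It is clear that the lower $y$-limit
is $y = 0$ and the upper $y$-limit is $y = x - 1$. These limits are the correct
way round because $x - 1 \ge 0$ throughout the region of integration.

We can also immediately read off the lower and upper $x$-limits as $x = 1$
and $x = 3$, respectively. The required area integral is then given by

\[
\int_S (x - y) \, dA = \int_{x=1}^{x=3} \left( \int_{y=0}^{y=x-1} (x - y) \, dy \right) dx.
\]

Carrying out the inner integration over $y$ first, we get

\[
\int_S (x - y) \, dA = \int_{x=1}^{x=3} \left[ xy - \tfrac{1}{2} y^2 \right]_{y=0}^{y=x-1} dx
= \int_{x=1}^{x=3} \left( \tfrac{1}{2} x^2 - \tfrac{1}{2} \right) dx.
\]

The remaining integration over $x$ then gives

\[
\int_S (x - y) \, dA = \left[ \tfrac{1}{6} x^3 - \tfrac{1}{2} x \right]_{x=1}^{x=3} = \frac{10}{3}.
\]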

Solution to Exercise 5 


The figure in the margin shows the required region. We choose to
integrate over $y$ first, and so imagine dividing the region into thin vertical
strips. The area is then given by

\[
I = \int_{x=0}^{x=\pi/2} \left( \int_{y=1-2x/\pi}^{y=\cos x} 1 \, dy \right) dx.
\]

Carrying out the integrals, we get

\[
I = \int_{x=0}^{x=\pi/2} \Big[ y \Big]_{y=1-2x/\pi}^{y=\cos x} dx
= \int_{x=0}^{x=\pi/2} \left( \cos x - 1 + \frac{2x}{\pi} \right) dx
= \left[ \sin x - x + \frac{x^2}{\pi} \right]_{x=0}^{x=\pi/2}
= 1 - \frac{\pi}{2} + \frac{\pi}{4} = 1 - \frac{\pi}{4}.
\]


Solution to Exercise 6 
(a) The first diagram in the margin shows the area of integration. 


Using this diagram, the given area integral can be written as 


[rr (fi tenes) 


(b) The second diagram in the margin shows the area of integration. 


Using this diagram, the given area integral can be written as 


[, (FE, tener) a 


Solution to Exercise 7 


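The diagram in the margin shows the given region. A typical narrow
horizontal strip extends across this region from $x = 0$ to $x = 4 - y^2$. These
are the lower and upper limits of the inner $x$-integration. The minimum
and maximum values of $y$ are $y = -2$ and $y = 2$, and these are the lower
and upper limits of the $y$-integration. So using equations (11) and (12),
the area is

\[
\int_{y=-2}^{y=2} \left( \int_{x=0}^{x=4-y^2} 1 \, dx \right) dy
= \int_{y=-2}^{y=2} (4 - y^2) \, dy
= \left[ 4y - \tfrac{1}{3} y^3 \right]_{y=-2}^{y=2} = \frac{32}{3}.
\]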

Solution to Exercise 8 


Following the hint in the question, we integrate first over $x$, then over $y$.
Considering a horizontal strip, the limits of the $x$-integration are
$0 \le x \le \sqrt{1 - y^2}$, so the area integral is


[ temaa - ie (aa a 


lI 
oy 
ll < 
So ll 
a 
— 
Nl 
1) 
aes 
i 8 
ro) 
Pa 
| 
< 
Ls} 
Q 


You can see why it is easier to do the integration in this order: integrating
over $x$ first leads to $\tfrac{1}{2} x^2$, and this allows us to avoid a tricky integral
involving square roots. The ability to anticipate such things is a useful
skill.


Solution to Exercise 9


Choosing to integrate first over y and then over x, the required area 
integral is 


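\[
\begin{aligned}
I &= \int_{x=0}^{x=1} \left( \int_{y=0}^{y=x} \exp(x^2) \, dy \right) dx \\
  &= \int_{x=0}^{x=1} \Big[ y \exp(x^2) \Big]_{y=0}^{y=x} \, dx \\
  &= \int_{x=0}^{x=1} x \exp(x^2) \, dx.
\end{aligned}
\]

We make the substitution $u = x^2$. Then we have $du/dx = 2x$, and the
lower and upper limits of integration become $u = 0$ and $u = 1$. Hence

\[
I = \int_{x=0}^{x=1} \exp(u) \, \frac{1}{2} \frac{du}{dx} \, dx
  = \frac{1}{2} \int_{u=0}^{u=1} \exp(u) \, du
  = \frac{1}{2} \Big[ \exp(u) \Big]_{u=0}^{u=1}
  = \frac{1}{2}(e - 1) \simeq 0.859.
\]

This tactic is chosen because the argument $x^2$ of $\exp(x^2)$ has
derivative $2x$, which is proportional to the factor $x$ in the integrand.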
If we had tried to integrate first over x and then over y, the area integral 

would have been written as 


\[ I = \int_{y=0}^{y=1} \left( \int_{x=y}^{x=1} \exp(x^2) \, dx \right) dy. \]


This is correct, but frustrating, because the integration over x cannot be 
done using standard mathematical functions. 
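
The value $\tfrac{1}{2}(e - 1)$ can be checked numerically. The following is a minimal sketch rather than part of the original solution: it applies a simple midpoint rule to the iterated integral, integrating over y from 0 to x first and then over x from 0 to 1. The function name `iterated_integral` and the grid size `n` are illustrative choices.

```python
import math


def iterated_integral(f, x_lo, x_hi, y_lo, y_hi, n=400):
    """Midpoint-rule estimate of an area integral of f(x, y).

    The region is x_lo <= x <= x_hi, y_lo(x) <= y <= y_hi(x), and the
    inner integration is over y (a vertical strip at each x).
    """
    dx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx
        y0, y1 = y_lo(x), y_hi(x)
        dy = (y1 - y0) / n
        inner = sum(f(x, y0 + (j + 0.5) * dy) for j in range(n)) * dy
        total += inner * dx
    return total


# Triangle 0 <= y <= x <= 1 with integrand exp(x^2).
estimate = iterated_integral(
    lambda x, y: math.exp(x * x), 0.0, 1.0, lambda x: 0.0, lambda x: x
)
exact = 0.5 * (math.e - 1.0)
print(estimate, exact)
```

The two printed numbers should agree to several decimal places. Notice that the numerical method is indifferent to the order of integration, whereas the analytical calculation is tractable in only one of the two orders.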


Solution to Exercise 10 


The mass of the block is given by the volume integral 


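\[
\begin{aligned}
M &= \int_R (x + y + z) \, dV \\
  &= \int_{x=0}^{x=2} \left( \int_{y=1}^{y=2} \left( \int_{z=2}^{z=5} (x + y + z) \, dz \right) dy \right) dx.
\end{aligned}
\]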


Evaluating the inner integral over z gives 


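\[
\begin{aligned}
M &= \int_{x=0}^{x=2} \left( \int_{y=1}^{y=2} \Big[ xz + yz + \tfrac{1}{2} z^2 \Big]_{z=2}^{z=5} \, dy \right) dx \\
  &= \int_{x=0}^{x=2} \left( \int_{y=1}^{y=2} \left( 3x + 3y + \tfrac{21}{2} \right) dy \right) dx.
\end{aligned}
\]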


Evaluating the integral over y gives 
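\[
\begin{aligned}
M &= \int_{x=0}^{x=2} \Big[ 3xy + \tfrac{3}{2} y^2 + \tfrac{21}{2} y \Big]_{y=1}^{y=2} \, dx \\
  &= \int_{x=0}^{x=2} \left( (6x + 6 + 21) - \left( 3x + \tfrac{3}{2} + \tfrac{21}{2} \right) \right) dx \\
  &= \int_{x=0}^{x=2} (3x + 15) \, dx.
\end{aligned}
\]

Finally, the integral over x gives

\[ M = \Big[ \tfrac{3}{2} x^2 + 15x \Big]_{x=0}^{x=2} = 36. \]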


So the mass of the block is 36 kilograms. 
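
As a cross-check, the volume integral can be estimated with a three-dimensional midpoint rule. This is a sketch under stated assumptions: the block is taken to occupy $0 \le x \le 2$, $1 \le y \le 2$, $2 \le z \le 5$ (the limits consistent with the intermediate results above), the density is $x + y + z$, and the helper name `triple_midpoint` is hypothetical.

```python
def triple_midpoint(f, x_range, y_range, z_range, n=20):
    """Midpoint-rule estimate of a volume integral over a cuboid."""
    (x0, x1), (y0, y1), (z0, z1) = x_range, y_range, z_range
    dx, dy, dz = (x1 - x0) / n, (y1 - y0) / n, (z1 - z0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            for k in range(n):
                z = z0 + (k + 0.5) * dz
                total += f(x, y, z)
    return total * dx * dy * dz


# Assumed limits: 0 <= x <= 2, 1 <= y <= 2, 2 <= z <= 5.
mass = triple_midpoint(lambda x, y, z: x + y + z, (0.0, 2.0), (1.0, 2.0), (2.0, 5.0))
print(mass)
```

Because the density is linear in each coordinate, the midpoint rule incurs no discretisation error here, so the estimate agrees with the analytical value up to rounding.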


Solution to Exercise 11 


We have 


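\[
\begin{aligned}
f(x, y, z) &= xyz \, e^{-(x^2 + y^2 + z^2)} \\
           &= xyz \, e^{-x^2} e^{-y^2} e^{-z^2} \\
           &= x e^{-x^2} \times y e^{-y^2} \times z e^{-z^2},
\end{aligned}
\]

which is a product of the form $u(x)\,v(y)\,w(z)$. The required volume
integral over the cube therefore becomes

\[
I = \int_{x=0}^{x=1} x e^{-x^2} \, dx \times \int_{y=0}^{y=1} y e^{-y^2} \, dy \times \int_{z=0}^{z=1} z e^{-z^2} \, dz.
\]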


The individual integrals are evaluated using a method similar to that used
in Exercise 9. In the integral over x, for example, we substitute $u = x^2$.
Then $du/dx = 2x$, and the limits $x = 0$ and $x = 1$ become $u = 0$ and
$u = 1$, so

\[
\int_{x=0}^{x=1} x e^{-x^2} \, dx
= \int_{u=0}^{u=1} \tfrac{1}{2} e^{-u} \, du
= \Big[ -\tfrac{1}{2} e^{-u} \Big]_{u=0}^{u=1}
= \tfrac{1}{2} (1 - e^{-1}).
\]

Similar results apply to the y and z integrals, so the volume integral is

\[ I = \tfrac{1}{8} (1 - e^{-1})^3 \simeq 0.0316. \]
Solution to Exercise 12 


The required volume integral is 


\[
\int_R z^2 \, dV = \int_{z=0}^{z=1} \left( \int_{y=0}^{y=1-z} \left( \int_{x=0}^{x=1-y-z} z^2 \, dx \right) dy \right) dz.
\]


The inner integral is with respect to x, and we evaluate this first (holding 
y and z constant). We get 


\[
\begin{aligned}
\int_{x=0}^{x=1-y-z} z^2 \, dx &= \Big[ z^2 x \Big]_{x=0}^{x=1-y-z} \\
  &= z^2 (1 - y - z) = z^2 (1 - z) - z^2 y.
\end{aligned}
\]


The middle integral is with respect to y. Evaluating this (with z held 
constant), we obtain 


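\[
\begin{aligned}
\int_{y=0}^{y=1-z} \left( z^2 (1 - z) - z^2 y \right) dy
  &= \Big[ z^2 (1 - z) y - \tfrac{1}{2} z^2 y^2 \Big]_{y=0}^{y=1-z} \\
  &= \tfrac{1}{2} z^2 (1 - z)^2.
\end{aligned}
\]

Finally, we integrate over z to obtain

\[
\begin{aligned}
\int_{z=0}^{z=1} \tfrac{1}{2} z^2 (1 - z)^2 \, dz
  &= \int_{z=0}^{z=1} \left( \tfrac{1}{2} z^2 - z^3 + \tfrac{1}{2} z^4 \right) dz \\
  &= \Big[ \tfrac{1}{6} z^3 - \tfrac{1}{4} z^4 + \tfrac{1}{10} z^5 \Big]_{z=0}^{z=1} \\
  &= \tfrac{1}{6} - \tfrac{1}{4} + \tfrac{1}{10} = \tfrac{1}{60},
\end{aligned}
\]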
which does agree with Example 4. 
Solution to Exercise 13 


The region R and its projection onto the xy-plane are shown in the figure 
below. 


The required volume integral is 


[ fevaae f (/, (f ‘eyzde) av) dx. 


Although the integrand is a product function, this volume integral cannot 
be expressed as the product of three ordinary integrals because the limits 
of integration are not all constants. 


The integral over z gives 
ae z=1-— 
/ xyz dz = [5$2°yz"] >, . 
z=0 
ge y(l—y)? = 5a7(y— 2y* +y”). 
The integral over y then gives 


y=1 
=1 
|, stu 2 + vay = [Be Ge? — B08 +) 
y=0 


Finally, integrating over x gives 


= 
[ tevnzyav= [O getae= [eh = 


Solution to Exercise 14 


There is no need to draw diagrams in this case because the limits are given 
in the question. However, we need to ensure that the integrals are nested 
in the right way. The required volume is 


\[
V = \int_{z=0}^{z=1} \left( \int_{y=0}^{y=z} \left( \int_{x=0}^{x=y^2+z^2} 1 \, dx \right) dy \right) dz.
\]


The integral over x is chosen as the innermost integral because its limits 
depend on the other two variables of integration, y and z. The integral 
over y is done next because its limits depend on z. The integral over z is 
chosen as the outermost integral because its limits are constants. 


Carrying out the integrals, we obtain 


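\[
\begin{aligned}
V &= \int_{z=0}^{z=1} \left( \int_{y=0}^{y=z} (y^2 + z^2) \, dy \right) dz \\
  &= \int_{z=0}^{z=1} \Big[ \tfrac{1}{3} y^3 + z^2 y \Big]_{y=0}^{y=z} \, dz \\
  &= \int_{z=0}^{z=1} \tfrac{4}{3} z^3 \, dz
   = \Big[ \tfrac{1}{3} z^4 \Big]_{z=0}^{z=1} = \tfrac{1}{3}.
\end{aligned}
\]

Since the lengths are measured in metres, the volume is $0.33\ \mathrm{m}^3$.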
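
The nesting order matters because each inner limit may depend only on variables that are still to be integrated. The following sketch (an illustration, with the hypothetical function name `nested_volume`) mirrors that nesting for the region $0 \le z \le 1$, $0 \le y \le z$, $0 \le x \le y^2 + z^2$, using a midpoint rule for the y and z integrals. The innermost integral over x is done exactly, since it is just the length of the strip.

```python
def nested_volume(n=200):
    """Midpoint estimate of the volume 0 <= z <= 1, 0 <= y <= z, 0 <= x <= y^2 + z^2."""
    dz = 1.0 / n
    total = 0.0
    for k in range(n):
        z = (k + 0.5) * dz          # outermost variable: constant limits 0 and 1
        dy = z / n                  # the y-limits depend on z
        inner = 0.0
        for j in range(n):
            y = (j + 0.5) * dy
            strip_length = y * y + z * z   # integral of 1 over x from 0 to y^2 + z^2
            inner += strip_length * dy
        total += inner * dz
    return total


volume = nested_volume()
print(volume)
```

The estimate should be close to one third, in agreement with the result quoted above.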
Solution to Exercise 15 


Recognising that $x^2 + y^2 = r^2$, the surface density function becomes


f(r.) = py OR? — 1?) 


in polar coordinates. Note that we continue to use the symbol f for this 
function even though it is now expressed in terms of new variables. This 
convention was discussed at the start of this book. 


The total number of bacteria on the dish is given by the area integral 


o=20 r=R 
N= ( [io rar) dg 
o=0 r=0 


C o=20 r=R 
= =/ | (2R?r — r?) dr ) dé. 
o=0 r=0 


Carrying out the integral over r first gives 


o=20 _ 
=f, [Realm we 


CG 
N = Fy X 2m x GRE = bn. 


This tactic works because $r^2$ in $\exp(-r^2)$ has a derivative that is
proportional to the factor $r$ in the integrand.

Solution to Exercise 16


The semicircular region S is defined by $0 \le r \le R$ and $0 \le \phi \le \pi$.
Recalling that $x = r \cos\phi$ and including the factor r required by the area
element in polar coordinates, we obtain 


\[
\int_S x \, dA = \int_{\phi=0}^{\phi=\pi} \left( \int_{r=0}^{r=R} r \cos\phi \; r \, dr \right) d\phi.
\]


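The integrand is the product of a function of r and a function of $\phi$, and
the limits of integration are all constants. This allows us to write the area
integral as a product of two ordinary integrals. Hence

\[
\begin{aligned}
\int_S x \, dA &= \int_{r=0}^{r=R} r^2 \, dr \times \int_{\phi=0}^{\phi=\pi} \cos\phi \, d\phi \\
  &= \tfrac{1}{3} R^3 \times \Big[ \sin\phi \Big]_{\phi=0}^{\phi=\pi} = 0.
\end{aligned}
\]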


This answer is not surprising: the region of integration is symmetrical 
about the y-axis, but the integrand is an odd function of x, so 
contributions from x < 0 cancel those from x > 0. 


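A similar calculation for $y = r \sin\phi$ gives

\[
\begin{aligned}
\int_S y \, dA &= \int_{r=0}^{r=R} r^2 \, dr \times \int_{\phi=0}^{\phi=\pi} \sin\phi \, d\phi \\
  &= \tfrac{1}{3} R^3 \times \Big[ -\cos\phi \Big]_{\phi=0}^{\phi=\pi} \\
  &= \tfrac{1}{3} R^3 \times 2 = \tfrac{2}{3} R^3.
\end{aligned}
\]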


Solution to Exercise 17 


The area integral of $e^{-r^2}$ over the entire xy-plane is

\[
I = \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=0}^{r=\infty} e^{-r^2} \, r \, dr \right) d\phi.
\]


The integrand is a function of r only. This can be regarded as a product
function where the function of $\phi$ is equal to 1. Also, the limits of
integration are all constants. We can therefore write the area integral as
the product of two ordinary integrals:

\[
I = \int_{\phi=0}^{\phi=2\pi} 1 \, d\phi \times \int_{r=0}^{r=\infty} e^{-r^2} \, r \, dr
  = 2\pi \int_{r=0}^{r=\infty} e^{-r^2} \, r \, dr.
\]


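We make the substitution $u = r^2$. Then $du/dr = 2r$, and the new limits of
integration are $u = 0$ and $u = \infty$. Hence

\[
\begin{aligned}
I &= 2\pi \int_{r=0}^{r=\infty} e^{-u} \, \frac{1}{2} \frac{du}{dr} \, dr \\
  &= \pi \int_{u=0}^{u=\infty} e^{-u} \, du \\
  &= \pi \Big[ -e^{-u} \Big]_{u=0}^{u=\infty} \\
  &= \pi \left( -e^{-\infty} + e^{0} \right) = \pi,
\end{aligned}
\]

where we have interpreted $e^{-\infty}$ as being equal to zero. This is appropriate
because $e^{-x}$ tends to zero as x tends to infinity.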
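
The interpretation of the infinite upper limit can be illustrated numerically. In the sketch below (an illustration only), the radial integral is truncated at a finite radius `r_max`. Both `r_max = 6` and the number of steps are assumptions, chosen so that the neglected tail, of order $e^{-36}$, is negligible.

```python
import math


def radial_integral(r_max=6.0, n=6000):
    """Midpoint estimate of the integral of exp(-r^2) * r from r = 0 to r = r_max."""
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += math.exp(-r * r) * r
    return total * dr


# The integral over phi contributes a factor of 2*pi.
area_integral = 2.0 * math.pi * radial_integral()
print(area_integral, math.pi)
```

Increasing `r_max` further makes no visible difference to the estimate, which is the numerical counterpart of taking $e^{-\infty}$ to be zero.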


Solution to Exercise 18 


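The region of integration corresponds to $2 \le r \le 5$, $0 \le \phi \le 2\pi$ and
$-1 \le z \le 1$, so the required volume integral of $rz^2$ is

\[
I = \int_{z=-1}^{z=1} \left( \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=2}^{r=5} r^2 z^2 \, dr \right) d\phi \right) dz,
\]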


where the extra factor of r in the integrand comes from the volume 
element in cylindrical coordinates. 


In an integral such as this, where the integrand is a product of a function 
of r and a function of z, and the limits of integration are all constants, we 
can write the integral as a product of three ordinary integrals: 


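\[
\begin{aligned}
I &= \int_{z=-1}^{z=1} z^2 \, dz \times \int_{\phi=0}^{\phi=2\pi} 1 \, d\phi \times \int_{r=2}^{r=5} r^2 \, dr \\
  &= \Big[ \tfrac{1}{3} z^3 \Big]_{z=-1}^{z=1} \times 2\pi \times \Big[ \tfrac{1}{3} r^3 \Big]_{r=2}^{r=5} \\
  &= \tfrac{2}{3} \times 2\pi \times \tfrac{117}{3} \\
  &= 52\pi.
\end{aligned}
\]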


Solution to Exercise 19 


Using equation (27), the volume of the rugby ball is 


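\[
\begin{aligned}
V &= \pi \int_{z=-b}^{z=b} a^2 \left( 1 - \frac{z^2}{b^2} \right) dz \\
  &= \pi a^2 \left[ z - \frac{z^3}{3b^2} \right]_{z=-b}^{z=b} \\
  &= \tfrac{4}{3} \pi a^2 b.
\end{aligned}
\]

Check: When a = b = R, the rugby ball becomes a sphere of radius R, and
the volume becomes $4\pi R^3/3$, as expected.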
Solution to Exercise 20 


The figure in the margin shows a cross-section through the spherical cap 
(shaded). At a given value of z, the surface of the cap has cylindrical 
radial coordinate 


\[ r = \sqrt{R^2 - z^2}. \]


The spherical cap has R/2 < z < R, so its volume is 


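\[
\begin{aligned}
V &= \pi \int_{z=R/2}^{z=R} (R^2 - z^2) \, dz \\
  &= \pi \Big[ R^2 z - \tfrac{1}{3} z^3 \Big]_{z=R/2}^{z=R} \\
  &= \pi R^3 \left( 1 - \tfrac{1}{3} - \tfrac{1}{2} + \tfrac{1}{24} \right) \\
  &= \tfrac{5}{24} \pi R^3.
\end{aligned}
\]

Solution to Exercise 21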


The required volume integral is 


o=20 g=7 r=2 -: 
r= | (/ (/ nt sin dr) i) dd. 
o=0 g=0 r=1 


The integrand is a product function, and the limits of integration are all 
constants, so the volume integral can be written as 


o=20 0=n r=2 
r= | 1a6 x f sin? 9.0 xf r dr. 
o=0 6=0 r 


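Using the trigonometric identity $\sin^2\theta = \tfrac{1}{2}(1 - \cos(2\theta))$, we get

\[
\int_{\theta=0}^{\theta=\pi} \sin^2\theta \, d\theta
= \frac{1}{2} \int_{\theta=0}^{\theta=\pi} (1 - \cos(2\theta)) \, d\theta
= \frac{1}{2} \Big[ \theta - \tfrac{1}{2} \sin(2\theta) \Big]_{\theta=0}^{\theta=\pi}
= \tfrac{1}{2} \pi.
\]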
So the required volume integral is 


Solution to Exercise 22 


(a) The variable r is the radial coordinate of spherical coordinates, so the 
volume integral is 


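\[
\int_{\text{sphere}} f \, dV = \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} \left( \int_{r=0}^{r=R} f(r) \, r^2 \sin\theta \, dr \right) d\theta \right) d\phi.
\]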


The integrand is a product function, and the limits of integration are 
all constants, so the volume integral can be split into the product of 
three ordinary integrals: 


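\[
\begin{aligned}
\int_{\text{sphere}} f \, dV
  &= \int_{\phi=0}^{\phi=2\pi} 1 \, d\phi \times \int_{\theta=0}^{\theta=\pi} \sin\theta \, d\theta \times \int_{r=0}^{r=R} f(r) \, r^2 \, dr \\
  &= 2\pi \times \Big[ -\cos\theta \Big]_{\theta=0}^{\theta=\pi} \times \int_0^R f(r) \, r^2 \, dr \\
  &= 4\pi \int_0^R f(r) \, r^2 \, dr,
\end{aligned}
\]
as required.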


(b) The quantity $\sqrt{x^2 + y^2 + z^2}$ is the distance of a point from the origin,
which is equal to the radial coordinate r in spherical coordinates. This
can be established more formally using equations (28), which give

\[
\begin{aligned}
x^2 + y^2 + z^2 &= r^2 \sin^2\theta \cos^2\phi + r^2 \sin^2\theta \sin^2\phi + r^2 \cos^2\theta \\
  &= r^2 \sin^2\theta \, (\cos^2\phi + \sin^2\phi) + r^2 \cos^2\theta \\
  &= r^2 (\sin^2\theta + \cos^2\theta) = r^2.
\end{aligned}
\]

Hence the result of part (a) gives

\[
\int_{\text{sphere}} \sqrt{x^2 + y^2 + z^2} \, dV = 4\pi \int_0^R r^3 \, dr = \pi R^4.
\]

Solution to Exercise 23


Using equation (33), we get the following results. In cylindrical 
coordinates, 


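\[ \delta V = 1 \times r \times 1 \; \delta r \, \delta\phi \, \delta z = r \, \delta r \, \delta\phi \, \delta z, \]
and in spherical coordinates,
\[ \delta V = 1 \times r \times r \sin\theta \; \delta r \, \delta\theta \, \delta\phi = r^2 \sin\theta \, \delta r \, \delta\theta \, \delta\phi. \]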


Both of these expressions agree with our previous results. 


Solution to Exercise 24 


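Taking partial derivatives of x, y and z with respect to $r$, $\theta$ and $\phi$, we have

\[
\begin{aligned}
\frac{\partial x}{\partial r} &= \sin\theta \cos\phi, &
\frac{\partial y}{\partial r} &= \sin\theta \sin\phi, &
\frac{\partial z}{\partial r} &= \cos\theta, \\
\frac{\partial x}{\partial \theta} &= r \cos\theta \cos\phi, &
\frac{\partial y}{\partial \theta} &= r \cos\theta \sin\phi, &
\frac{\partial z}{\partial \theta} &= -r \sin\theta, \\
\frac{\partial x}{\partial \phi} &= -r \sin\theta \sin\phi, &
\frac{\partial y}{\partial \phi} &= r \sin\theta \cos\phi, &
\frac{\partial z}{\partial \phi} &= 0.
\end{aligned}
\]

Hence the scale factors are

\[
\begin{aligned}
h_r &= \sqrt{\sin^2\theta \cos^2\phi + \sin^2\theta \sin^2\phi + \cos^2\theta} \\
    &= \sqrt{\sin^2\theta (\cos^2\phi + \sin^2\phi) + \cos^2\theta} \\
    &= 1, \\
h_\theta &= \sqrt{r^2 \cos^2\theta \cos^2\phi + r^2 \cos^2\theta \sin^2\phi + r^2 \sin^2\theta} \\
    &= \sqrt{r^2 \left( \cos^2\theta (\cos^2\phi + \sin^2\phi) + \sin^2\theta \right)} \\
    &= r, \\
h_\phi &= \sqrt{r^2 \sin^2\theta \sin^2\phi + r^2 \sin^2\theta \cos^2\phi + 0} \\
    &= \sqrt{r^2 \sin^2\theta (\sin^2\phi + \cos^2\phi)} \\
    &= r \sin\theta,
\end{aligned}
\]
where the square roots have been taken in the knowledge that $r \ge 0$ and
$\sin\theta \ge 0$ for spherical coordinates. Our answers agree with equation (36).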
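
The scale factors can also be estimated numerically, which gives a useful check on the algebra. The sketch below is an illustration (the function names `position` and `scale_factors` are hypothetical). It approximates each partial derivative of the position $(x, y, z)$ by a central finite difference and takes the magnitude of the resulting vector.

```python
import math


def position(r, theta, phi):
    """Cartesian position (x, y, z) of the point with spherical coordinates (r, theta, phi)."""
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )


def scale_factors(r, theta, phi, h=1e-6):
    """Estimate (h_r, h_theta, h_phi) using central finite differences with step h."""
    coords = (r, theta, phi)
    factors = []
    for i in range(3):
        plus = list(coords)
        minus = list(coords)
        plus[i] += h
        minus[i] -= h
        p_plus = position(*plus)
        p_minus = position(*minus)
        derivative = [(a - b) / (2.0 * h) for a, b in zip(p_plus, p_minus)]
        factors.append(math.sqrt(sum(d * d for d in derivative)))
    return tuple(factors)


h_r, h_theta, h_phi = scale_factors(2.0, 0.7, 1.2)
expected = (1.0, 2.0, 2.0 * math.sin(0.7))   # (1, r, r sin(theta)) at the same point
print((h_r, h_theta, h_phi), expected)
```

The same function works for any coordinate system once `position` is replaced, so it can be reused to check the cylindrical scale factors.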
Solution to Exercise 25 


The cross-section in the figure shows that the spherical cap (shaded)
extends from $\theta = 0$ to $\theta = \alpha$, where $\cos\alpha = (\tfrac{1}{2}R)/R = \tfrac{1}{2}$.

Using the area element in equation (40), we get

\[
\begin{aligned}
\text{surface area} &= \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\alpha} R^2 \sin\theta \, d\theta \right) d\phi \\
  &= 2\pi R^2 \Big[ -\cos\theta \Big]_{\theta=0}^{\theta=\alpha} \\
  &= 2\pi R^2 (1 - \cos\alpha).
\end{aligned}
\]

In the present case, $\cos\alpha = 1/2$, so the area of the cap is $\pi R^2$. For the
(practically) spherical surface of the Earth, this means that there is the
same amount of area north of latitude 30° (the latitude of Cairo) as there
is between latitude 30° and the Equator.


Note: This calculation works very well in spherical coordinates. By 
contrast, finding the volume of a spherical cap works better in cylindrical 
coordinates (see Exercise 20). In the volume integral the flat base of the 
spherical cap must be taken into account, and this is more simply 
described in cylindrical, rather than spherical, coordinates. 
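
The statement about latitude 30° follows directly from the formula $2\pi R^2(1 - \cos\alpha)$ and can be confirmed with a few lines of code. In this sketch, the function name `cap_area` and the choice $R = 1$ are illustrative assumptions. Latitude 30° corresponds to a polar angle of 60°, and the Equator to a polar angle of 90°.

```python
import math


def cap_area(radius, alpha):
    """Area of the spherical cap 0 <= theta <= alpha on a sphere of the given radius."""
    return 2.0 * math.pi * radius**2 * (1.0 - math.cos(alpha))


R = 1.0
north_of_30 = cap_area(R, math.radians(60.0))    # cap north of latitude 30 degrees
hemisphere = cap_area(R, math.radians(90.0))     # everything north of the Equator
between_30_and_equator = hemisphere - north_of_30
print(north_of_30, between_30_and_equator)
```

Both printed areas are equal to $\pi R^2$ to within rounding error.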


Solution to Exercise 26 
Expanding the determinant, we obtain 
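\[
\begin{aligned}
\mathbf{J} &= R^2 \sin^2\theta \cos\phi \, \mathbf{i} + R^2 \sin^2\theta \sin\phi \, \mathbf{j} + R^2 \cos\theta \sin\theta (\cos^2\phi + \sin^2\phi) \, \mathbf{k} \\
  &= R^2 \sin^2\theta \cos\phi \, \mathbf{i} + R^2 \sin^2\theta \sin\phi \, \mathbf{j} + R^2 \cos\theta \sin\theta \, \mathbf{k} \\
  &= R^2 \sin\theta \, (\sin\theta \cos\phi \, \mathbf{i} + \sin\theta \sin\phi \, \mathbf{j} + \cos\theta \, \mathbf{k}).
\end{aligned}
\]
The square of the magnitude of this vector is
\[
\begin{aligned}
|\mathbf{J}|^2 &= R^4 \sin^2\theta \left( \sin^2\theta (\cos^2\phi + \sin^2\phi) + \cos^2\theta \right) \\
  &= R^4 \sin^2\theta \, (\sin^2\theta + \cos^2\theta) \\
  &= R^4 \sin^2\theta.
\end{aligned}
\]
We have $R > 0$ and $0 \le \theta \le \pi$, so $R^2 \sin\theta \ge 0$. Hence $|\mathbf{J}| = R^2 \sin\theta$, and
the area of a surface element is

\[ \delta A = |\mathbf{J}| \, \delta\theta \, \delta\phi = R^2 \sin\theta \, \delta\theta \, \delta\phi. \]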


Solution to Exercise 27 


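Differentiating the functions for x, y and z with respect to $r$ and $\phi$, we get

\[
\begin{aligned}
\frac{\partial x}{\partial r} &= \cos\phi, &
\frac{\partial y}{\partial r} &= \sin\phi, &
\frac{\partial z}{\partial r} &= \frac{r}{R}, \\
\frac{\partial x}{\partial \phi} &= -r \sin\phi, &
\frac{\partial y}{\partial \phi} &= r \cos\phi, &
\frac{\partial z}{\partial \phi} &= 0,
\end{aligned}
\]
so the Jacobian vector is
\[
\begin{aligned}
\mathbf{J} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \cos\phi & \sin\phi & r/R \\ -r \sin\phi & r \cos\phi & 0 \end{vmatrix} \\
  &= -\frac{r^2}{R} \cos\phi \, \mathbf{i} - \frac{r^2}{R} \sin\phi \, \mathbf{j} + r (\cos^2\phi + \sin^2\phi) \, \mathbf{k} \\
  &= -\frac{r^2}{R} \cos\phi \, \mathbf{i} - \frac{r^2}{R} \sin\phi \, \mathbf{j} + r \, \mathbf{k}.
\end{aligned}
\]
Hence
\[
|\mathbf{J}|^2 = \frac{r^4}{R^2} (\cos^2\phi + \sin^2\phi) + r^2 = r^2 \left( 1 + \frac{r^2}{R^2} \right).
\]

The area element is therefore
\[
\delta A = |\mathbf{J}| \, \delta r \, \delta\phi = \left( 1 + \frac{r^2}{R^2} \right)^{1/2} r \, \delta r \, \delta\phi,
\]

and the surface area of the dish is

\[
\text{surface area} = \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=0}^{r=R} \left( 1 + \frac{r^2}{R^2} \right)^{1/2} r \, dr \right) d\phi.
\]

To evaluate the integral over r, we make the substitution $u = 1 + r^2/R^2$.
Then $du/dr = 2r/R^2$, and the lower and upper limits of integration
become $u = 1$ and $u = 2$. So we get

\[
\begin{aligned}
\text{surface area} &= \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=0}^{r=R} u^{1/2} \times \tfrac{1}{2} R^2 \, \frac{du}{dr} \, dr \right) d\phi \\
  &= \int_{\phi=0}^{\phi=2\pi} \left( \tfrac{1}{2} R^2 \int_{u=1}^{u=2} u^{1/2} \, du \right) d\phi \\
  &= \tfrac{1}{3} R^2 \int_{\phi=0}^{\phi=2\pi} (\sqrt{8} - 1) \, d\phi = \tfrac{2}{3} \pi (\sqrt{8} - 1) R^2.
\end{aligned}
\]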
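
The closed-form result can be compared with a direct numerical evaluation of the surface-area integral. The sketch below is an illustration only: the function name `dish_area`, the value $R = 1.5$ and the number of steps are arbitrary assumptions. The integral over $\phi$ contributes a factor of $2\pi$ because the integrand is independent of $\phi$.

```python
import math


def dish_area(R, n=2000):
    """Midpoint estimate of 2*pi times the integral of sqrt(1 + r^2/R^2) * r from 0 to R."""
    dr = R / n
    radial = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        radial += math.sqrt(1.0 + (r / R) ** 2) * r
    return 2.0 * math.pi * radial * dr


R = 1.5
numerical = dish_area(R)
closed_form = (2.0 / 3.0) * math.pi * (math.sqrt(8.0) - 1.0) * R**2
print(numerical, closed_form)
```

Since $\sqrt{8} - 1 \simeq 1.83$, the curved dish has about 22% more area than the flat disc of radius R that it covers.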


Solution to Exercise 28 


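Taking partial derivatives of x, y and z with respect to u and v, we get

\[
\begin{aligned}
\frac{\partial x}{\partial u} &= u, &
\frac{\partial y}{\partial u} &= u, &
\frac{\partial z}{\partial u} &= v, \\
\frac{\partial x}{\partial v} &= v, &
\frac{\partial y}{\partial v} &= -v, &
\frac{\partial z}{\partial v} &= u.
\end{aligned}
\]
Hence
\[
\mathbf{J} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u & u & v \\ v & -v & u \end{vmatrix}
= (u^2 + v^2) \, \mathbf{i} - (u^2 - v^2) \, \mathbf{j} - 2uv \, \mathbf{k},
\]
and

\[
\begin{aligned}
|\mathbf{J}|^2 &= (u^4 + 2u^2 v^2 + v^4) + (u^4 - 2u^2 v^2 + v^4) + 4u^2 v^2 \\
  &= 2 (u^4 + 2u^2 v^2 + v^4) \\
  &= 2 (u^2 + v^2)^2.
\end{aligned}
\]
The required surface area is

\[
\begin{aligned}
\text{surface area} &= \int_{v=0}^{v=1} \left( \int_{u=0}^{u=1} \sqrt{2} \, (u^2 + v^2) \, du \right) dv \\
  &= \sqrt{2} \int_{v=0}^{v=1} \left( \tfrac{1}{3} + v^2 \right) dv
   = \sqrt{2} \, \Big[ \tfrac{1}{3} v + \tfrac{1}{3} v^3 \Big]_{v=0}^{v=1}
   = \tfrac{2}{3} \sqrt{2}.
\end{aligned}
\]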

Unit 9


Differentiating scalar and vector fields 


Introduction 


This unit investigates fields and explains what can be learned by 
differentiating them. 


Roughly speaking, a field is a physical quantity that has definite values at 
points throughout a region of space. For example, at a given instant, each 
point in a room has a particular temperature. Near a radiator, the 
temperature may be 40°C, but near an open door it may be only 10°C. 
We cannot say that the room has a single temperature, but each point in 
the room does have a definite temperature, and the distribution of 
temperatures throughout the room is described by a temperature field. 


If the room is a cuboid of dimensions a x b x c, we can choose a Cartesian 
coordinate system with its origin at one corner and its axes running along 
three adjacent edges of the cuboid. Then each point in the room can be 
represented by a triplet of coordinates (x,y,z), and the temperature field 
can be represented by a function 


\[ T = T(x, y, z) \quad (0 \le x \le a,\ 0 \le y \le b,\ 0 \le z \le c), \]


where the conditions in parentheses specify the domain of the function, 
which corresponds to the region inside the room. 


A second example is provided by wind velocity in a given region of the 
atmosphere. This would be of keen interest to anyone living close to the 
track of a tornado, for example! We focus on a cubic volume that is fixed 
relative to the ground, with sides of length 1 kilometre, and arrange the 
axes of a Cartesian coordinate system to run along three adjacent edges of 
this cube. At a given instant, the wind velocity may vary throughout the 
cube, but at each point it has a definite velocity, described by a velocity 
vector v. The distribution of wind velocities within the cube is described 
by a velocity field, and is represented by a function 


\[ \mathbf{v} = \mathbf{v}(x, y, z) \quad (0 \le x \le 1000,\ 0 \le y \le 1000,\ 0 \le z \le 1000), \]


where x, y and z are the coordinates of a point (measured in metres), and 
the conditions in parentheses restrict attention to the region inside the 
cube, which is the domain of the function. 


At each point (x, y, z) in its domain, the function $\mathbf{v}(x, y, z)$ specifies a
vector: the wind velocity at that point. A velocity vector $\mathbf{v}$ can be
written in component form as

\[ \mathbf{v} = v_x \, \mathbf{i} + v_y \, \mathbf{j} + v_z \, \mathbf{k}, \]

where $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ are Cartesian unit vectors, and $v_x$, $v_y$ and $v_z$ are the
corresponding Cartesian components. So the wind velocity field can be
written as

\[ \mathbf{v}(x, y, z) = v_x(x, y, z) \, \mathbf{i} + v_y(x, y, z) \, \mathbf{j} + v_z(x, y, z) \, \mathbf{k}, \]

which involves three functions $v_x(x, y, z)$, $v_y(x, y, z)$ and $v_z(x, y, z)$. Each
of these functions has the domain $0 \le x \le 1000$, $0 \le y \le 1000$,
$0 \le z \le 1000$, corresponding to the cubic region of interest.


Fields are classified according to the nature of the physical quantity that
they describe. In this module we consider two types of field:

• scalar fields describe the distribution of a scalar quantity (such as
temperature) throughout a region

• vector fields describe the distribution of a vector quantity (such as
velocity) throughout a region.

Figure 1 Michael Faraday (1791-1867)

Figure 2 Peter Higgs (1929-)


Fields are everywhere 


The example of a temperature field arises naturally in the context of 
heating a room, and the example of a wind velocity field is clearly 
important for weather forecasters. However, the full importance of 
the field concept goes beyond what these examples suggest. 


One of the first people to suspect this was Michael Faraday (Figure 1), 
an extraordinary genius who had only a rudimentary education but 
gained entry into the scientific world by taking notes in public 
lectures, presenting a bound copy to the lecturer, and asking if he 
could help with experiments. As Faraday became more independent, 
he made revolutionary discoveries about electricity and magnetism. 


Mulling over his observations, Faraday became convinced that 
magnets and electric currents produce magnetic fields in the space 
around them, and that other magnets and electric currents respond to 
the magnetic fields that they encounter. Before long, it was realised 
that electrical and magnetic phenomena were best described using 
two vector fields: the electric field and the magnetic field. Faraday 
was not a mathematician and could not develop his ideas in terms of 
equations, but he highlighted the importance of the field concept, and 
the urgent need to develop a calculus of fields became clear. 


At first, physicists thought that electric and magnetic fields must 
describe distortions in a mysterious medium, which they called the 
ether. However, this is not the modern view: electric and magnetic 
fields (along with gravitational fields) are now regarded as part of the 
fabric of the Universe. Since the 1930s, this view has been extended 
to matter, as quantum field theories treat fundamental particles such 
as electrons or quarks as states of excitation of various fields. In 1964, 
Peter Higgs (Figure 2) and others predicted a new type of scalar field 
called the Higgs field. Nearly 50 years later, the existence of this field 
was confirmed by the Large Hadron Collider near Geneva, and Higgs 
shared the Nobel Prize for Physics in 2013. 


Study guide 


This unit is concerned with the description of fields and, in particular, 
with ways of characterising them by their rates of change with respect to 
position. These rates of change are expressed in terms of partial
derivatives with respect to the coordinates. You will therefore need to be
familiar with partial differentiation, as covered in Unit 7. 


One of the themes of the preceding unit on multiple integration was the 
use of different types of coordinate system. You saw, for example, that a 
volume integral over a spherical region is simplified by using spherical 
coordinates. A similar situation applies to fields. Many of the situations 
considered by scientists involve fields with cylindrical or spherical 
symmetry, and it is then a great advantage to use cylindrical or spherical 
coordinates. You will therefore need to be familiar with the coordinate 
systems introduced in Unit 8: in particular, you will need to be familiar 
with the concept of a scale factor, as outlined in Section 4 of Unit 8. 


The unit is organised as follows. Section 1 gives essential background for 
the main topics that follow. It defines scalar and vector fields, and 
describes ways of representing them, both visually and in terms of 
equations. Section 2 defines the gradient of a scalar field. You met 
gradients in Unit 7, so part of this section is a review. However, we will go 
beyond the material in Unit 7 and show how gradients are represented in 
non-Cartesian coordinate systems. In the process, we will introduce the 
important concept of the del operator, denoted by $\nabla$.


The rest of the unit is concerned with the spatial derivatives of vector 
fields. In three dimensions, a vector field has three components, each of 
which may depend on three position coordinates, so there are nine partial 
derivatives that describe how rapidly a vector field varies with position. 
Three of these partial derivatives can be grouped together to define a 
scalar quantity called the divergence, and the other six can be grouped 
together to form a vector quantity called the curl. As their names suggest, 
divergence and curl have direct physical interpretations, which will be 
explored in this unit and the next. Section 3 discusses divergence, and 
Section 4 discusses curl. As in the rest of the unit, these quantities will be 
described in a variety of coordinate systems. 


1 Scalar and vector fields 


1.1 Preliminary remarks on scalar and vector fields 


The Introduction described a field as a physical quantity with definite 
values at points throughout a region of space. This is a fair description, 
but some clarification is needed. 


• In practice, fields are used to describe physical quantities, and it is
generally helpful to keep real examples in mind. Physicists use many
different types of field, but in this unit we focus on examples such as
temperature and velocity fields that require a minimum of background
knowledge.

• Fields are classified according to the type of quantity involved. This
module discusses scalar fields and vector fields. Other types of field
exist as well (for example, elastic stress at each point in a solid is best
described by a square matrix), but scalar and vector fields are by far
the most important in physical applications.

• In this unit we are especially interested in differentiating fields, so we
assume that they vary smoothly from point to point. In particular, we
assume that all the required partial derivatives exist.

Figure 3 The temperature at P has a definite value, no matter what the
orientation of the coordinate system

Figure 4 The velocity vector at P has a definite magnitude and direction,
but its components depend on the orientation of the coordinate system


One other aspect of fields must be discussed. In the case of a scalar field, 
such as temperature, the value of the field at a given point is independent 
of the orientation of the coordinate system used to label the point. For 
example, in Figure 3, the blue and red coordinate systems are at 45° to one 
another. However, the temperature at P is 30°C whether we describe that 
point by the coordinates x = 1, y = 1 in the blue coordinate system or by 
the coordinates $X = \sqrt{2}$, $Y = 0$ in the red coordinate system. Such
invariance in value is taken as a defining property of a scalar field. 


In the case of a vector field, such as wind velocity, similar ideas apply but 
the details play out differently. At a given point P, a vector field has a 
definite magnitude and a definite direction in space, irrespective of the 
orientation of our coordinate system. For example, in Figure 4 the velocity 
field at P has a magnitude of 5 metres per second and points in a 
northward direction. We require that this meaning is preserved no matter 
what the orientation of the coordinate system. Such invariance in 
magnitude and direction is taken as a defining property of a vector 
field. Let us explore further what this means. 


In Figure 4, the bold arrow indicates the velocity vector at P (5 metres per 
second in a northward direction). If we describe this vector in the blue 
coordinate system, we get components $v_x = 0$ and $v_y = 5$ (in metres per
second). But if we describe the same vector in the red coordinate system,
the components have different values: $v_X = 5/\sqrt{2}$ and $v_Y = 5/\sqrt{2}$. So the
components of the vector depend on the orientation of the coordinate 
system. Nevertheless, these are just different descriptions of the same 
vector — the vector itself does not depend on the orientation of the 
coordinate system. 
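
The distinction between a vector and its components can be made concrete with a short calculation. The sketch below is an illustration: it assumes that the red axes are obtained by rotating the blue axes anticlockwise through 45°, which is consistent with P having blue coordinates (1, 1) and red coordinates $(\sqrt{2}, 0)$. The function name `components_in_rotated_axes` is hypothetical.

```python
import math


def components_in_rotated_axes(vector, angle):
    """Components of a 2D vector along axes rotated anticlockwise by `angle` (radians)."""
    vx, vy = vector
    c, s = math.cos(angle), math.sin(angle)
    return (c * vx + s * vy, -s * vx + c * vy)


angle = math.radians(45.0)

# The point P: (1, 1) in the blue system.
p_red = components_in_rotated_axes((1.0, 1.0), angle)

# The velocity at P: 5 metres per second northwards in the blue system.
v_blue = (0.0, 5.0)
v_red = components_in_rotated_axes(v_blue, angle)

speed_blue = math.hypot(*v_blue)
speed_red = math.hypot(*v_red)
print(p_red, v_red, speed_blue, speed_red)
```

The components change from (0, 5) to $(5/\sqrt{2}, 5/\sqrt{2})$, but the magnitude computed from either set of components is 5 metres per second, as required for a single underlying vector.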


Notice, by the way, that it would be incorrect to describe the components 
of a vector field as scalar fields. This is because the components of a vector 
field depend on the orientation of the coordinate system but, by definition, 
scalar fields do not. 


In any given coordinate system, fields are described by functions of several 
variables (the coordinates of points). This unit and the next will describe 
the differentiation and integration of fields, and you might wonder whether 
there will be anything new to say, beyond the topics already covered in 
Units 7 and 8. Indeed there is, and the fundamental reason for this is that 
scalar and vector fields have properties that transcend the choice of 
coordinate system. This gives us a richer structure to explore — and one 
that is directly relevant to descriptions of the physical world. 




1.2 Describing scalar fields 


In a given Cartesian coordinate system, a scalar field is described by a 
single function. For example, a temperature field may be described in two 
dimensions by a function T(x, y), and in three dimensions by a function 
T(x,y,z). This idea was discussed in the Introduction of Unit 7, but we 
give a brief review here. 


Suppose that the temperature field on the surface of a circular disc of 
radius 1 metre is given by the function 


\[ T(x, y) = 100 \, e^{-(x^2 + y^2)} \quad (\sqrt{x^2 + y^2} \le 1), \tag{1} \]
where T is the temperature in degrees Celsius, x and y are Cartesian 
coordinates for points on the surface of the disc (measured in metres), and 
the origin is at the centre of the disc. The condition in parentheses gives 
the domain of the function, which is the surface of the disc. Clearly, the 
centre of the disc has temperature T(0,0) = 100, and a point on the edge 
of the disc has temperature $T(1, 0) = 100 \, e^{-1} \simeq 37$ (in degrees Celsius).


There are a number of ways of representing this situation graphically. 
Figure 5(a) shows a three-dimensional diagram of the temperature on the 
surface of the disc, with x and y plotted horizontally and T plotted 
vertically. Figure 5(b) shows a slice through this surface at y = 0, which 
gives a graph of T against x when y = 0.


Figure 5 Visualisation of the temperature field T(x, y) in equation (1),
with values in degrees Celsius: (a) a perspective view with T plotted
vertically; (b) a graph of T against x for a slice with y = 0

While both of these representations are useful, they are somewhat limited.
The perspective view gives a good overall impression, but we cannot read
values accurately from its scales. The graph in Figure 5(b) is quantitative,
but it does not tell us about the behaviour for other slices (such as a slice
with y = 0.1).

Perhaps the best tool for visualising a scalar field in two dimensions is to
draw a contour map. Figure 6 shows a contour map for the temperature
field described by equation (1). The orange curves are contour lines.
Each contour line joins neighbouring points where the temperature has a
fixed value, marked next to the contour line.

Figure 6 Contour lines (in orange) for equation (1), with values in
degrees Celsius

Figure 7 Polar coordinates

For particular scalar fields, contour lines are sometimes given special
names. For example, lines joining points of equal temperature are called 
isotherms, and lines joining points of equal pressure are called isobars. We 
will not bother with these terms here because, from the point of view of 
mathematics, they are all the same idea. 


By studying a contour map, we can get a good idea of the form of a scalar 
field. In Figure 6, the contours corresponding to higher temperatures are 
closer to the centre of the disc, so the temperature falls as we move 
outwards. It is significant that the contour lines are circles around the 
centre of the disc. This shows that the temperature falls equally in all 
outward radial directions. 


These ideas can be extended to three dimensions. For example, the 
temperature field throughout a sphere of radius 1 metre could be given by 
the function 


\[ T(x, y, z) = 100 \, e^{-(x^2 + y^2 + z^2)} \quad (\sqrt{x^2 + y^2 + z^2} \le 1), \tag{2} \]


where T is the temperature in degrees Celsius, x, y and z (in metres) are 
Cartesian coordinates, and the origin is at the centre of the sphere. The 
domain of the function is given in parentheses: this is the region occupied 
by the sphere. 


In this case, rather than contour lines, there are contour surfaces joining 
neighbouring points where the temperature has a fixed value. A series of 
contour surfaces can be imagined at equally-spaced values of temperature. 
For the field specified in equation (2), these would be concentric spherical 
surfaces centred on the origin, and a cross-section through these surfaces 
at z = 0 would look exactly like Figure 6. We inevitably struggle to give 
an accurate impression of contour surfaces using a two-dimensional sketch, 
and it is generally necessary to show a cross-sectional view. 


Scalar fields in polar coordinates 


One of the themes that emerged from our study of multiple integrals was 
the importance of choosing a suitable coordinate system. Apart from 
Cartesian coordinates, three coordinate systems are of special importance 
in this module (and in physics and applied mathematics more generally). 
They are polar coordinates, cylindrical coordinates and spherical 
coordinates. It is a straightforward task to represent scalar fields in terms 
of these coordinates. We begin with polar coordinates. 


As shown in Figure 7, points in the xy-plane can be labelled by polar
coordinates $(r, \phi)$, which are related to Cartesian coordinates (x, y) by
the equations

\[ x = r \cos\phi, \quad y = r \sin\phi. \tag{3} \]




If a two-dimensional scalar field is expressed in Cartesian coordinates as 
$V(x, y)$, it is easy to express it in polar coordinates: we just replace x and
y by $r$ and $\phi$ using equations (3). This substitution ensures that the scalar
field has a definite value at a given point (regardless of the coordinate 
system used to label it). 


For example, to express the two-dimensional temperature field of 
equation (1) in polar coordinates, we note that 


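\[
\begin{aligned}
x^2 + y^2 &= (r \cos\phi)^2 + (r \sin\phi)^2 \\
  &= r^2 (\cos^2\phi + \sin^2\phi) \\
  &= r^2,
\end{aligned}
\]
so
\[ T(r, \phi) = 100 \, e^{-r^2} \quad (r \le 1). \tag{4} \]

The relationship $x^2 + y^2 = r^2$ is also a consequence of Pythagoras's
theorem in Figure 7.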

Notice that we have used the symbols $T(x, y)$ and $T(r, \phi)$ to represent the
temperature field in Cartesian and polar coordinates, even though these
are different mathematical functions. The arguments of the function,
$(x, y)$ or $(r, \phi)$, indicate whether we intend the function of equation (1) or
the function of equation (4). This follows our usual convention, used many 
times previously in this module, and for good reason: it would be 
impractical to invent new symbols for temperature (or any other physical 
quantity) every time we change coordinates. 


Notice also that equation (4) is simpler than equation (1) because it 
depends on just one variable, r. This is an important point. A 
two-dimensional scalar field that is unchanged by any rotation around the 
origin is said to be rotationally symmetric. It is generally wise to 
describe such a field in polar coordinates. 
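
The convention of using the same symbol T for two different functions can be mirrored in code by writing two separate functions that return the same value at the same physical point. The sketch below is illustrative: the function names are hypothetical and the domain restriction $r \le 1$ is not enforced.

```python
import math


def temperature_cartesian(x, y):
    """Equation (1): temperature in degrees Celsius at Cartesian coordinates (x, y)."""
    return 100.0 * math.exp(-(x * x + y * y))


def temperature_polar(r, phi):
    """Equation (4): the same field in polar coordinates; phi is accepted but unused."""
    return 100.0 * math.exp(-r * r)


def to_cartesian(r, phi):
    """Equations (3): convert polar coordinates to Cartesian coordinates."""
    return (r * math.cos(phi), r * math.sin(phi))


r, phi = 0.8, 2.1
x, y = to_cartesian(r, phi)
same_point_values = (temperature_cartesian(x, y), temperature_polar(r, phi))

# Rotational symmetry: changing phi at fixed r leaves the temperature unchanged.
rotated_x, rotated_y = to_cartesian(r, phi + 1.0)
rotated_value = temperature_cartesian(rotated_x, rotated_y)
print(same_point_values, rotated_value)
```

The fact that `temperature_polar` never uses its second argument is the computational signature of rotational symmetry.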


Exercise 1 


Each of the following expressions represents a two-dimensional scalar field 
expressed in Cartesian coordinates. Find expressions for these fields in 
polar coordinates. 


(a) $U(x, y) = x^2 - y^2$
(b) $V(x, y) = 2xy$


(c) W(a,y) = a ((e,y) # (0,0)) 


Scalar fields in cylindrical coordinates 
For three-dimensional fields, the most commonly used non-Cartesian 


coordinate systems are cylindrical and spherical. Figure 8(a) illustrates a 
cylindrical coordinate system (r, ¢, z). 
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Recall: in cylindrical 
coordinates, r is not the distance 
from the origin. 
Figure 8 A cylindrical coordinate system: (a) the coordinates $(r, \phi, z)$;
(b) relationship to Cartesian coordinates 


The relationship with Cartesian coordinates can be found using 
trigonometry in Figure 8(b). We get 


\[
x = r\cos\phi, \quad y = r\sin\phi, \quad z = z, \tag{5}
\]

where

\[
r = \sqrt{x^2 + y^2} \tag{6}
\]


is the distance from the z-axis. The first two equations in (5) are the same 
as for two-dimensional polar coordinates, while the third equation, z = z, 
shows that the z-coordinates of cylindrical and Cartesian coordinates are 
identical. 


If a scalar field is a known function of x, y and z, we can use equations (5) 
to express it in terms of cylindrical coordinates. For example, the scalar 
field 


\[
U(x, y, z) = x^2 + y^2 + 2z^2
\]
is expressed in Cartesian coordinates. In cylindrical coordinates, it is
\[
\begin{aligned}
U(r, \phi, z) &= r^2\cos^2\phi + r^2\sin^2\phi + 2z^2 \\
              &= r^2 + 2z^2,
\end{aligned}
\]
and this is a simpler description because it depends on two coordinates
rather than three. Any three-dimensional scalar field that is independent
of the $\phi$-coordinate of cylindrical coordinates is said to be axially
symmetric. It is usually better to describe such fields in cylindrical
coordinates rather than Cartesian coordinates.
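
As a quick check of this conversion, here is a minimal sketch that evaluates the field both ways and tests for axial symmetry by sampling several values of $\phi$. The helper `is_axially_symmetric` is a hypothetical name introduced for illustration; sampling a few angles is only a spot check, not a proof.

```python
import math

def cylindrical_to_cartesian(r, phi, z):
    """Equations (5): x = r cos(phi), y = r sin(phi), z = z."""
    return r * math.cos(phi), r * math.sin(phi), z

def u_cartesian(x, y, z):
    """U(x, y, z) = x^2 + y^2 + 2 z^2."""
    return x**2 + y**2 + 2.0 * z**2

def u_cylindrical(r, phi, z):
    """U(r, phi, z) = r^2 + 2 z^2; phi does not appear."""
    return r**2 + 2.0 * z**2

def is_axially_symmetric(field, r, z, angles, tol=1e-12):
    """Spot check: does field(r, phi, z) take the same value at each sampled phi?"""
    values = [field(r, phi, z) for phi in angles]
    return max(values) - min(values) <= tol

r, phi, z = 1.3, 0.8, -0.4
print(u_cartesian(*cylindrical_to_cartesian(r, phi, z)), u_cylindrical(r, phi, z))
print(is_axially_symmetric(u_cylindrical, r, z, [0.0, 1.0, 2.0, 3.0]))
```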


Exercise 2 


A temperature field (in degrees Celsius) is specified in Cartesian 
coordinates by 


\[
T(x, y, z) = 100\,e^{-(x^2 + y^2 + z^2)} \quad (x^2 + y^2 + z^2 < 1).
\]


Express this field in cylindrical coordinates, and find the temperature at a
point with cylindrical coordinates $(r, \phi, z) = (0.5, 0, 0.5)$.


Scalar fields in spherical coordinates 


Figure 9(a) shows a spherical coordinate system $(r, \theta, \phi)$.


Figure 9 A spherical coordinate system: (a) the coordinates $(r, \theta, \phi)$;
(b) relationship to Cartesian coordinates


The relationship with Cartesian coordinates can be found using 
trigonometry in the right-angled triangles shown in Figure 9(b). We get 


\[
x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta, \tag{7}
\]

where

\[
r = \sqrt{x^2 + y^2 + z^2} \tag{8}
\]


is the distance from the origin. We can check that this is consistent with 
the sum of the squares of x, y and z in equations (7): 
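\[
\begin{aligned}
x^2 + y^2 + z^2 &= r^2\sin^2\theta\cos^2\phi + r^2\sin^2\theta\sin^2\phi + r^2\cos^2\theta \\
                &= r^2\sin^2\theta(\cos^2\phi + \sin^2\phi) + r^2\cos^2\theta \\
                &= r^2(\sin^2\theta + \cos^2\theta) \\
                &= r^2.
\end{aligned}
\]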
Using this expression, the three-dimensional temperature field in 
equation (2) and Exercise 2 can be expressed in spherical coordinates as 


\[
T(r, \theta, \phi) = 100\,e^{-r^2} \quad (r < 1).
\]


This is simpler than the Cartesian or cylindrical descriptions because it
depends on one coordinate rather than three or two. Any
three-dimensional scalar field that is independent of the $\theta$- and
$\phi$-coordinates of spherical coordinates is said to be spherically
symmetric. It is generally advisable to describe such fields in spherical
coordinates.
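
The consistency check above, and the resulting spherical form of the temperature field, can be confirmed numerically. This is a minimal sketch using only equations (7) and (8); the function names are our own and the sample point is arbitrary.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Equations (7)."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def temperature_cartesian(x, y, z):
    """Field of Exercise 2: T = 100 exp(-(x^2 + y^2 + z^2))."""
    return 100.0 * math.exp(-(x**2 + y**2 + z**2))

def temperature_spherical(r, theta, phi):
    """Spherical form: T = 100 exp(-r^2); theta and phi do not appear."""
    return 100.0 * math.exp(-r**2)

r, theta, phi = 0.7, 1.1, 2.3
x, y, z = spherical_to_cartesian(r, theta, phi)
print(math.sqrt(x**2 + y**2 + z**2), r)            # equation (8) recovers r
print(temperature_cartesian(x, y, z), temperature_spherical(r, theta, phi))
```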


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Figure 10 An arrow map for 


the velocity vector field in 
equation (9) 
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Exercise 3 


Express the scalar field
\[
U(x, y, z) = \frac{z}{(x^2 + y^2 + z^2)^{1/2}}
\]

in terms of: 

(a) cylindrical coordinates, 


(b) spherical coordinates. 


1.3 Describing vector fields 


You have seen that a vector field is a function that associates a vector with 
each point in a given region. For example, if the surface of a river is 
treated as being flat, the velocity of water flowing on this surface can be 
represented by a function of the form 


\[
\mathbf{v}(x, y) = v_x(x, y)\,\mathbf{i} + v_y(x, y)\,\mathbf{j},
\]
where the Cartesian coordinates x and y label points on the river’s surface,
and the unit vectors i and j point in the directions of increasing x and
increasing y. The domain of v(x, y) is the river’s surface, and within this
domain, the functions $v_x(x, y)$ and $v_y(x, y)$ give the x- and y-components
of the water’s velocity at any point (x, y). This is a two-dimensional vector
field.


To take a specific case, suppose that a river has a straight stretch between 
x = 0 and x = 100, with its banks at y = 0 and y = 10 (all measured in 
metres). Then the flow of water (in metres per second) might be described 
by the function 


\[
\mathbf{v}(x, y) = y(10 - y)\,\mathbf{i} \quad (0 < x < 100,\ 0 < y < 10). \tag{9}
\]


In this simple model, all the water flows in the direction of i (the 
x-direction). The rate of flow does not depend on the downstream 
distance x, but is fastest in the middle of the river (y = 5), and drops to 
zero at either bank (at y = 0 and y = 10). 


A good way of visualising a two-dimensional vector field is to draw an 
arrow map. For a vector field in the xy-plane, this is done by choosing a
selection of points in the xy-plane and drawing an arrow at each point. 
Each arrow points in the direction of the vector field, and has a length 
proportional to the magnitude of the field. For example, the arrow map in 
Figure 10 illustrates the velocity vector field in equation (9). The arrows 
all point in the x-direction (downstream) and are longer in the middle of 
the river where the flow is fastest. 


Another example of a vector field is provided by the gravitational field 
around a star. The gravitational field at a point P is defined to be the 
gravitational force per unit mass experienced by a small body placed at P. 
Gravity is an attractive force, and the gravitational field due to the star 
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points inwards, towards the centre of the star. The gravitational influence 
of the star decreases as we move away from it, and the magnitude of the 
gravitational field outside the star turns out to be proportional to $1/r^2$,
where r is the distance from the centre of the star. 


This is Newton’s celebrated
inverse square law of gravity.


The gravitational field of a star is three-dimensional, so it cannot be
captured on a two-dimensional arrow map. But we can show a
cross-section of the field in the xy-plane, as in Figure 11. As you would
expect, all the arrows point towards the centre of the star, and their length
increases as they get closer to the star.


An alternative graphical way of depicting vector fields is to sketch a field
line map. Instead of drawing arrows at a set of discrete points, we draw
continuous directed lines called field lines. At each point along its path,
the direction of a field line is the direction of the vector field, and this is
indicated by placing one or more arrows on the field line. Vector fields
generally vary smoothly in space, so the field lines are generally continuous
curves. They tell us the direction of the vector field but they do not, by
themselves, reveal the relative magnitudes of the field at different points. 
Figure 12(a) shows field lines for the velocity vector field of equation (9), 
and Figure 12(b) shows the field lines for the gravitational field around a 
star. 




Figure 12 Field line maps for vector fields: (a) the velocity vector field of 
equation (9); (b) the gravitational field due to a star 


So far, we have described vector fields in Cartesian coordinates and 
Cartesian unit vectors. But for vector fields with axial or spherical 
symmetry, it is often better to use non-Cartesian coordinates and 
non-Cartesian unit vectors. We briefly survey how these descriptions work 
in polar, cylindrical and spherical coordinates before looking more closely 
at the details. 


Vector fields in polar coordinates 


When a two-dimensional vector field v is described in a given Cartesian 
coordinate system, it takes the form 


\[
\mathbf{v}(x, y) = v_x(x, y)\,\mathbf{i} + v_y(x, y)\,\mathbf{j}.
\]


Figure 11 An arrow map for 
the gravitational field in the 
xy-plane due to a star at the 
origin 


Figure 13. The polar 
coordinate system 
It is worth noting that the unit vectors i and j are an important part of
this description, as essential as the components $v_x$ and $v_y$. If we were to
choose a different set of Cartesian unit vectors, pointing in different 
directions, we would have a different set of components. 


When we represent a vector field in a given coordinate system, the first 
step is to define suitable unit vectors. Let us recall how this is done in a 
two-dimensional Cartesian coordinate system. In this case, the unit
vector i is a vector of unit magnitude that points in the direction of
increasing x, with y held constant. Similarly, the unit vector j is a vector 
of unit magnitude that points in the direction of increasing y, with x held 
constant. These unit vectors point in fixed directions, perpendicular to one 
another. 


Now consider polar coordinates $(r, \phi)$ in a plane (shown again in
Figure 13). These are related to Cartesian coordinates by the equations

\[
x = r\cos\phi, \quad y = r\sin\phi. \tag{10}
\]

Based on these coordinates, at any given point P, we can introduce two
unit vectors $\mathbf{e}_r$ and $\mathbf{e}_\phi$, as shown in Figure 14.

• $\mathbf{e}_r$ is a vector of unit magnitude pointing in the direction of
  increasing r, with $\phi$ held constant.

• $\mathbf{e}_\phi$ is a vector of unit magnitude pointing in the direction of
  increasing $\phi$, with r held constant.


Figure 14 Unit vectors and coordinate lines in polar coordinates:
r-coordinate lines in blue, $\phi$-coordinate lines in orange


These unit vectors are related to the coordinate lines marked in Figure 14.
The r-coordinate lines (in blue) are radial paths along which r increases
while $\phi$ remains constant, while the $\phi$-coordinate lines (in orange) are
circular paths along which $\phi$ increases while r remains constant. At any
point P, $\mathbf{e}_r$ is tangential to the r-coordinate line through P, and $\mathbf{e}_\phi$ is
tangential to the $\phi$-coordinate line through P. Because the radial paths
meet the circles at right angles, the unit vectors $\mathbf{e}_r$ and $\mathbf{e}_\phi$ are mutually
orthogonal.
Using trigonometry in Figure 15, we can resolve $\mathbf{e}_r$ and $\mathbf{e}_\phi$ along the
directions i and j to get the following useful formulas.

\[
\begin{aligned}
\mathbf{e}_r    &= \cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}, \\
\mathbf{e}_\phi &= -\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}.
\end{aligned} \tag{11}
\]


Figure 15 Relating polar unit vectors to Cartesian unit vectors


Exercise 4

Show that the vectors given in equations (11) are unit vectors that are
orthogonal to one another.


The Cartesian unit vectors i and j are constant vectors: they remain the
same, no matter which point (x, y) is being described. But the same is not
true for the polar unit vectors $\mathbf{e}_r$ and $\mathbf{e}_\phi$. This is implicit in equations (11)
and is illustrated in Figure 16. At point P, the radial unit vector $\mathbf{e}_r$ points
directly away from the origin in the direction OP; at another point Q, it
points in the direction OQ. The ‘transverse’ unit vector $\mathbf{e}_\phi$, which is
always perpendicular to $\mathbf{e}_r$, also varies with position.


Figure 16 The polar unit vectors vary from one point to another


At first sight, this may seem to be an unwelcome complication, but in
many cases it gives us the freedom to replace a complicated description in
Cartesian coordinates by a simpler description in polar coordinates. The
vectors $\mathbf{e}_r$ and $\mathbf{e}_\phi$ tell us the local radial and transverse directions at a
given point, and these are sometimes very significant. For example, a mass
at the origin produces an inward radial gravitational field, which is best
described using radial unit vectors. This is more natural and simpler than
introducing Cartesian unit vectors, which point in arbitrary directions
unrelated to the direction of the gravitational field.


An example of a two-dimensional vector field expressed in polar 
coordinates is provided by 


\[
\mathbf{v}(r, \phi) = 2r\,\mathbf{e}_r. \tag{12}
\]


At each point, this field points radially outwards, and its magnitude is
proportional to the distance from the origin. The corresponding arrow
map is shown in Figure 17, where you should note that the arrows point
radially away from the origin, and have a length that increases steadily as
we move further away from the origin.


Figure 17 An arrow map for the vector field in equation (12)


The vector field given in equation (12) can be converted to Cartesian
coordinates by using equations (10) and (11). At a point with coordinates
(x, y) we get
\[
\mathbf{v}(x, y) = 2r(\cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}) = 2x\,\mathbf{i} + 2y\,\mathbf{j}.
\]


This description is entirely equivalent to equation (12), but the link to
Figure 17 is a little less obvious. In this case, polar coordinates provide the
simplest and most transparent description of the field.
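
The equivalence of the two descriptions can be spot-checked at a sample point. The sketch below builds the Cartesian components of a field from its polar components using equations (11); the helper names and the sample point are our own choices.

```python
import math

def polar_unit_vectors(phi):
    """Equations (11): Cartesian components of e_r and e_phi."""
    e_r = (math.cos(phi), math.sin(phi))
    e_phi = (-math.sin(phi), math.cos(phi))
    return e_r, e_phi

def polar_field_to_cartesian(v_r, v_phi, phi):
    """Cartesian components of v = v_r e_r + v_phi e_phi at polar angle phi."""
    e_r, e_phi = polar_unit_vectors(phi)
    return (v_r * e_r[0] + v_phi * e_phi[0],
            v_r * e_r[1] + v_phi * e_phi[1])

# Field of equation (12): v_r = 2r, v_phi = 0.
r, phi = 1.5, 0.7
x, y = r * math.cos(phi), r * math.sin(phi)   # equations (10)
v_x, v_y = polar_field_to_cartesian(2.0 * r, 0.0, phi)
print((v_x, v_y), (2.0 * x, 2.0 * y))         # the two pairs should agree
```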
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Figure 18 The cylindrical 
coordinate system 
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Vector fields in cylindrical coordinates 


Three-dimensional vector fields can also be described in non-Cartesian 
coordinates. Cylindrical coordinates $(r, \phi, z)$ are shown again in Figure 18.
They are related to Cartesian coordinates by the equations

\[
x = r\cos\phi, \quad y = r\sin\phi, \quad z = z, \tag{13}
\]
where $r = \sqrt{x^2 + y^2}$ is the distance from the z-axis (not the distance from
the origin) in this coordinate system.


At any given point P, we can introduce three unit vectors $\mathbf{e}_r$, $\mathbf{e}_\phi$ and $\mathbf{e}_z$,
based on cylindrical coordinates. These are illustrated in Figure 19 and
defined as follows.

• $\mathbf{e}_r$ points in the direction of increasing r, with $\phi$ and z held constant.

• $\mathbf{e}_\phi$ points in the direction of increasing $\phi$, with r and z held constant.

• $\mathbf{e}_z$ points in the direction of increasing z, with r and $\phi$ held constant.


Each of these unit vectors is tangential to a particular coordinate line
along which just one of the cylindrical coordinates (r, $\phi$ or z) increases
while the other two remain fixed.


Figure 19 Unit vectors and coordinate lines in cylindrical coordinates:
r-coordinate lines are shown in green, $\phi$-coordinate lines in red and
z-coordinate lines in blue


The unit vectors $\mathbf{e}_r$ and $\mathbf{e}_\phi$ lie in a plane parallel to the xy-plane and are
identical to the polar unit vectors, while the unit vector $\mathbf{e}_z$ points in the
z-direction and is identical to the unit vector k of Cartesian coordinates.
Hence equations (11) still apply for $\mathbf{e}_r$ and $\mathbf{e}_\phi$, and we have the following
results.

\[
\begin{aligned}
\mathbf{e}_r    &= \cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}, \\
\mathbf{e}_\phi &= -\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}, \\
\mathbf{e}_z    &= \mathbf{k}.
\end{aligned} \tag{14}
\]

These equations show that the unit vectors $\mathbf{e}_r$ and $\mathbf{e}_\phi$ depend on position,
but $\mathbf{e}_z$ remains constant, being equal to the Cartesian unit vector k.


An example of a vector field expressed in cylindrical coordinates is given by
\[
\mathbf{v}(r, \phi, z) = r\,\mathbf{e}_\phi. \tag{15}
\]

The field lines for this field are circles around the z-axis, as shown in
Figure 20.


Vector fields in spherical coordinates 


Spherical coordinates $(r, \theta, \phi)$ are shown again in Figure 21. They are
related to Cartesian coordinates by the equations

\[
x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta, \tag{16}
\]

where $r = \sqrt{x^2 + y^2 + z^2}$ is the distance from the origin.


At any given point, we can introduce three unit vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$ and $\mathbf{e}_\phi$.
These are illustrated in Figure 22 and defined as follows.

• $\mathbf{e}_r$ points in the direction of increasing r, with $\theta$ and $\phi$ held constant.

• $\mathbf{e}_\theta$ points in the direction of increasing $\theta$, with r and $\phi$ held constant.

• $\mathbf{e}_\phi$ points in the direction of increasing $\phi$, with r and $\theta$ held constant.


Each of these unit vectors is tangential to a coordinate line along which
just one of the coordinates (r, $\theta$ or $\phi$) increases while the other two remain
fixed.


Figure 22 Unit vectors and coordinate lines in spherical coordinates:
r-coordinate lines are shown in green, $\theta$-coordinate lines in blue and
$\phi$-coordinate lines in red


The spherical unit vectors can be visualised as follows. Imagine a spherical
coordinate system with its origin at the Earth’s centre, and a positive
z-axis that points from the origin through the North Pole. Then at a
typical point on the Earth’s surface, the vector $\mathbf{e}_r$ points vertically
upwards, the vector $\mathbf{e}_\theta$ points southwards, and the vector $\mathbf{e}_\phi$ points
eastwards. Obviously, these three unit vectors are mutually orthogonal.
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Figure 20 Field lines for the 
vector field in equation (15) 


Figure 21 The spherical 
coordinate system 


There is no unique southward 
direction at the North or South 
Pole, but such isolated 
exceptions can be safely ignored 
in this module. 
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g is the gravitational force per 
unit mass experienced by a body 
in the vicinity of the star. 
Note carefully that $\mathbf{e}_r$ points radially away from the origin in spherical
coordinates. This is not the same as the direction of $\mathbf{e}_r$ in cylindrical
coordinates (which points radially away from the z-axis). For this reason,
it is always important to state clearly which coordinate system is being
used. You cannot assume that the symbols speak for themselves.


While the meanings of the spherical unit vectors are clear, the
three-dimensional trigonometry needed to express $\mathbf{e}_r$, $\mathbf{e}_\theta$ and $\mathbf{e}_\phi$ in terms of
the Cartesian unit vectors i, j and k is rather cumbersome. For the
moment, we just quote the results.

\[
\begin{aligned}
\mathbf{e}_r      &= \sin\theta\cos\phi\,\mathbf{i} + \sin\theta\sin\phi\,\mathbf{j} + \cos\theta\,\mathbf{k}, \\
\mathbf{e}_\theta &= \cos\theta\cos\phi\,\mathbf{i} + \cos\theta\sin\phi\,\mathbf{j} - \sin\theta\,\mathbf{k}, \\
\mathbf{e}_\phi   &= -\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}.
\end{aligned} \tag{17}
\]

You should take these equations on trust for the moment; in Subsection 1.5
we will derive them using an alternative (non-geometric) route.
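
Although the derivation is deferred, equations (17) can at least be sanity-checked numerically. The following sketch (with helper names of our own) confirms at a sample point that the three vectors have unit length and are mutually orthogonal; as an extra check, not stated in the text, it also prints the vector product of the first two so that it can be compared with the third.

```python
import math

def spherical_unit_vectors(theta, phi):
    """Cartesian components of e_r, e_theta, e_phi from equations (17)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_r = (st * cp, st * sp, ct)
    e_theta = (ct * cp, ct * sp, -st)
    e_phi = (-sp, cp, 0.0)
    return e_r, e_theta, e_phi

def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    """Vector product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

e_r, e_theta, e_phi = spherical_unit_vectors(0.9, 2.1)
print(dot(e_r, e_r), dot(e_theta, e_theta), dot(e_phi, e_phi))   # each close to 1
print(dot(e_r, e_theta), dot(e_r, e_phi), dot(e_theta, e_phi))   # each close to 0
print(cross(e_r, e_theta), e_phi)                                # should agree
```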


An example of a vector field that is conveniently described in spherical 
coordinates is the gravitational field around a star. With a spherical 
coordinate system centred on the star, this field is 


\[
\mathbf{g}(r, \theta, \phi) = -\frac{GM}{r^2}\,\mathbf{e}_r \quad (r > 0), \tag{18}
\]


where G is a positive constant (called the constant of gravitation) and M
is the mass of the star.


Since M, G and r are all positive, the minus sign on the right-hand side of
equation (18) implies that the gravitational field vector g points in the
opposite direction to $\mathbf{e}_r$. In other words, it points radially inwards, towards
the star, corresponding to gravitational attraction. You can also see from
equation (18) that the magnitude of g is independent of $\theta$ and $\phi$, and
decreases as $1/r^2$ as the distance r from the star increases. The ability to
read the meaning of equations in this way is a valuable skill for scientists, 
engineers and all who need to relate equations to the real world. 


1.4 Vector field conversions 


In principle, you are free to choose whichever coordinate system you like, 
but the physical situation often singles out a preferred coordinate system — 
one that makes calculations easier. For all their apparent simplicity, 
Cartesian coordinates are not always the best choice. For example, the 
gravitational field of a star has spherical symmetry, and is best described 
in spherical coordinates, as in equation (18). 


We may be given a vector field in one coordinate system, and wish to 
express it in another coordinate system. 


As an example, suppose that we are given a vector field 
\[
\mathbf{v}(x, y) = v_x(x, y)\,\mathbf{i} + v_y(x, y)\,\mathbf{j} \tag{19}
\]

in Cartesian coordinates, and we want to express it in polar coordinates,
that is, in the form

\[
\mathbf{v}(r, \phi) = v_r(r, \phi)\,\mathbf{e}_r + v_\phi(r, \phi)\,\mathbf{e}_\phi. \tag{20}
\]

We need to find the functions $v_r(r, \phi)$ and $v_\phi(r, \phi)$ that describe how the
polar components of the given field vary from point to point.


A vector field does not depend on the orientation of the coordinate system 
used to describe it. So if a particular point has Cartesian coordinates (x, y)
and polar coordinates $(r, \phi)$, we must have

\[
\mathbf{v}(x, y) = \mathbf{v}(r, \phi). \tag{21}
\]

On the understanding that equations (19) and (20) refer to the same point,
we can simplify our notation by omitting the arguments (x, y) and $(r, \phi)$.
We write

\[
\mathbf{v} = v_x\,\mathbf{i} + v_y\,\mathbf{j}, \tag{22}
\]
\[
\mathbf{v} = v_r\,\mathbf{e}_r + v_\phi\,\mathbf{e}_\phi. \tag{23}
\]


Both of these equations refer to the same vector v, expressed in different 
coordinate systems. 


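The key to finding an expression for $v_r$ is to note that $\mathbf{e}_r$ and $\mathbf{e}_\phi$ are
orthogonal and of unit magnitude, so $\mathbf{e}_r \cdot \mathbf{e}_\phi = 0$ and $\mathbf{e}_r \cdot \mathbf{e}_r = 1$. Taking
the scalar product of both sides of equation (23) with $\mathbf{e}_r$ then gives

\[
\mathbf{e}_r \cdot \mathbf{v} = v_r\,\mathbf{e}_r \cdot \mathbf{e}_r + v_\phi\,\mathbf{e}_r \cdot \mathbf{e}_\phi = v_r.
\]
So
\[
v_r = \mathbf{e}_r \cdot \mathbf{v}.
\]

The scalar product on the right-hand side of this equation can be
expressed in Cartesian coordinates by taking $\mathbf{e}_r$ from equations (11) and
v from equation (22). We get

\[
v_r = (\cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}) \cdot (v_x\,\mathbf{i} + v_y\,\mathbf{j}) = \cos\phi\,v_x + \sin\phi\,v_y.
\]

An expression for $v_\phi$ can be found in a similar way. In this case, taking the
scalar product of both sides of equation (23) with $\mathbf{e}_\phi$ gives

\[
\mathbf{e}_\phi \cdot \mathbf{v} = v_r\,\mathbf{e}_\phi \cdot \mathbf{e}_r + v_\phi\,\mathbf{e}_\phi \cdot \mathbf{e}_\phi = v_\phi.
\]
So
\[
v_\phi = \mathbf{e}_\phi \cdot \mathbf{v}.
\]
Using equations (11) and (22) to expand the scalar product then gives
\[
v_\phi = (-\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}) \cdot (v_x\,\mathbf{i} + v_y\,\mathbf{j}) = -\sin\phi\,v_x + \cos\phi\,v_y.
\]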
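
The two projection formulas translate directly into a small function. The sketch below is illustrative only: the function name and the sample field (a uniform field equal to i everywhere) are our own choices. It also rebuilds the Cartesian components from the polar ones as a consistency check.

```python
import math

def polar_components(v_x, v_y, phi):
    """Polar components of a vector with Cartesian components (v_x, v_y),
    found by taking scalar products with e_r and e_phi of equations (11)."""
    v_r = math.cos(phi) * v_x + math.sin(phi) * v_y
    v_phi = -math.sin(phi) * v_x + math.cos(phi) * v_y
    return v_r, v_phi

# Hypothetical sample: the uniform field v = i, examined at phi = pi/2.
# There, i points against the direction of increasing phi.
phi = math.pi / 2
v_r, v_phi = polar_components(1.0, 0.0, phi)
print(v_r, v_phi)

# Rebuild the Cartesian components from v = v_r e_r + v_phi e_phi.
v_x = v_r * math.cos(phi) - v_phi * math.sin(phi)
v_y = v_r * math.sin(phi) + v_phi * math.cos(phi)
print(v_x, v_y)
```

Note that the same vector i has different polar components at different points, because $\mathbf{e}_r$ and $\mathbf{e}_\phi$ change direction from point to point.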
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A field like this could describe 
the velocity at points on a 
steadily rotating turntable. 
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Example 1 

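A two-dimensional vector field is expressed in Cartesian coordinates as
\[
\mathbf{v}(x, y) = -y\,\mathbf{i} + x\,\mathbf{j}.
\]
Express this field in polar coordinates $(r, \phi)$.

Solution

In polar coordinates, the field is expressed as

\[
\mathbf{v}(r, \phi) = v_r\,\mathbf{e}_r + v_\phi\,\mathbf{e}_\phi.
\]

We have
\[
\begin{aligned}
v_r &= \mathbf{e}_r \cdot \mathbf{v} \\
    &= (\cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}) \cdot (-y\,\mathbf{i} + x\,\mathbf{j}) \\
    &= -y\cos\phi + x\sin\phi.
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
v_\phi &= \mathbf{e}_\phi \cdot \mathbf{v} \\
       &= (-\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}) \cdot (-y\,\mathbf{i} + x\,\mathbf{j}) \\
       &= y\sin\phi + x\cos\phi.
\end{aligned}
\]

The coordinate transformation equations for polar coordinates are
$x = r\cos\phi$ and $y = r\sin\phi$. Using these, we get

\[
\begin{aligned}
v_r    &= -(r\sin\phi)\cos\phi + (r\cos\phi)\sin\phi = 0, \\
v_\phi &= (r\sin\phi)\sin\phi + (r\cos\phi)\cos\phi = r(\sin^2\phi + \cos^2\phi) = r.
\end{aligned}
\]
So in polar coordinates,

\[
\mathbf{v}(r, \phi) = r\,\mathbf{e}_\phi.
\]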


The method that we have just used to convert a vector field expressed in 
Cartesian form into polar coordinates can be extended. It relies on the fact 
that the polar unit vectors $\mathbf{e}_r$ and $\mathbf{e}_\phi$ are orthogonal. A similar method
works in cylindrical and spherical coordinates, and all other orthogonal 
coordinate systems. 


To take a general three-dimensional case, suppose that we are given a 
vector field F in Cartesian coordinates, 


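\[
\mathbf{F} = F_x\,\mathbf{i} + F_y\,\mathbf{j} + F_z\,\mathbf{k},
\]

and we wish to express it in an orthogonal coordinate system (u, v, w),
with orthogonal unit vectors $\mathbf{e}_u$, $\mathbf{e}_v$ and $\mathbf{e}_w$. Then we write

\[
\mathbf{F} = F_u\,\mathbf{e}_u + F_v\,\mathbf{e}_v + F_w\,\mathbf{e}_w,
\]
where the components $F_u$, $F_v$ and $F_w$ are unknown functions of (u, v, w).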


These functions can be found using the following procedure. 


Procedure 1 Finding components in orthogonal coordinates 


For a vector field F in an orthogonal coordinate system (u, v, w), the
component $F_u$ in the direction of $\mathbf{e}_u$ is found as follows.

1. Write down
   \[
   F_u = \mathbf{e}_u \cdot \mathbf{F}
   \]
   and expand the scalar product on the right-hand side using
   Cartesian expressions for $\mathbf{e}_u$ and F (involving i, j and k).

2. The resulting expression generally depends on (x, y, z). Use
   coordinate transformation equations of the form
   \[
   x = x(u, v, w), \quad y = y(u, v, w), \quad z = z(u, v, w)
   \]
   to obtain an expression for $F_u$ solely in terms of u, v and w.


The following example illustrates how this procedure is used. 


Example 2 
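In Cartesian coordinates, a vector field takes the form
\[
\mathbf{B}(x, y, z) = yz\,\mathbf{i} - xz\,\mathbf{j} + 2xy\,\mathbf{k}.
\]
Express this field in cylindrical coordinates $(r, \phi, z)$.

Solution

We write the field as
\[
\mathbf{B}(r, \phi, z) = B_r\,\mathbf{e}_r + B_\phi\,\mathbf{e}_\phi + B_z\,\mathbf{e}_z,
\]
where the cylindrical unit vectors are given by equations (14):
\[
\mathbf{e}_r = \cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}, \quad
\mathbf{e}_\phi = -\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}, \quad
\mathbf{e}_z = \mathbf{k}.
\]
Using Procedure 1, we get
\[
\begin{aligned}
B_r    &= \mathbf{e}_r \cdot \mathbf{B} = (\cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}) \cdot (yz\,\mathbf{i} - xz\,\mathbf{j} + 2xy\,\mathbf{k}) \\
       &= yz\cos\phi - xz\sin\phi, \\
B_\phi &= \mathbf{e}_\phi \cdot \mathbf{B} = (-\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}) \cdot (yz\,\mathbf{i} - xz\,\mathbf{j} + 2xy\,\mathbf{k}) \\
       &= -yz\sin\phi - xz\cos\phi, \\
B_z    &= \mathbf{e}_z \cdot \mathbf{B} = \mathbf{k} \cdot (yz\,\mathbf{i} - xz\,\mathbf{j} + 2xy\,\mathbf{k}) = 2xy.
\end{aligned}
\]
These expressions still involve (x, y, z), but in cylindrical coordinates
\[
x = r\cos\phi, \quad y = r\sin\phi, \quad z = z,
\]
so we get
\[
\begin{aligned}
B_r    &= (r\sin\phi)\,z\cos\phi - (r\cos\phi)\,z\sin\phi = 0, \\
B_\phi &= -(r\sin\phi)\,z\sin\phi - (r\cos\phi)\,z\cos\phi = -rz(\sin^2\phi + \cos^2\phi) = -rz, \\
B_z    &= 2r^2\cos\phi\sin\phi = r^2\sin(2\phi).
\end{aligned}
\]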
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Hence in cylindrical coordinates, 


\[
\mathbf{B}(r, \phi, z) = -rz\,\mathbf{e}_\phi + r^2\sin(2\phi)\,\mathbf{e}_z.
\]
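
The result of Example 2 can be checked by carrying out Procedure 1 numerically at a sample point. This is a minimal sketch for cylindrical coordinates only; the function names and the sample point are our own.

```python
import math

def cylindrical_unit_vectors(phi):
    """Equations (14): Cartesian components of e_r, e_phi, e_z."""
    return ((math.cos(phi), math.sin(phi), 0.0),
            (-math.sin(phi), math.cos(phi), 0.0),
            (0.0, 0.0, 1.0))

def b_cartesian(x, y, z):
    """Field of Example 2: B = yz i - xz j + 2xy k."""
    return (y * z, -x * z, 2.0 * x * y)

def cylindrical_components(field, r, phi, z):
    """Procedure 1: project the Cartesian field onto each cylindrical unit
    vector, evaluating the field at the point with coordinates (r, phi, z)."""
    x, y = r * math.cos(phi), r * math.sin(phi)   # step 2: transformation equations
    f = field(x, y, z)
    return tuple(sum(e_i * f_i for e_i, f_i in zip(e, f))   # step 1: scalar products
                 for e in cylindrical_unit_vectors(phi))

r, phi, z = 1.2, 0.9, -0.5
print(cylindrical_components(b_cartesian, r, phi, z))
print((0.0, -r * z, r**2 * math.sin(2.0 * phi)))   # components found in Example 2
```

The two printed triples should agree to within rounding error.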


Exercise 5 
In Cartesian coordinates, a vector field takes the form 
\[
\mathbf{F}(x, y, z) = (x + y)\,\mathbf{i} + (y - x)\,\mathbf{j} + 3z\,\mathbf{k}.
\]

Express this field in cylindrical coordinates $(r, \phi, z)$.


Exercise 6 
In Cartesian coordinates, a vector field takes the form 
\[
\mathbf{F}(x, y, z) = \mathbf{k}.
\]
Use equations (17) to express this field in spherical coordinates $(r, \theta, \phi)$.


You have seen how to convert a vector field from Cartesian coordinates to 
non-Cartesian orthogonal coordinates. We sometimes need the opposite 
conversion, from non-Cartesian to Cartesian coordinates. For example, a 
vector field may be given in spherical coordinates as 


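\[
\mathbf{F} = \frac{1}{r^2}\,\mathbf{e}_r.
\]

To express this field in Cartesian coordinates, we begin by using
equations (17) to express $\mathbf{e}_r$ in terms of i, j and k. This gives
\[
\mathbf{F} = \frac{1}{r^2}\,(\sin\theta\cos\phi\,\mathbf{i} + \sin\theta\sin\phi\,\mathbf{j} + \cos\theta\,\mathbf{k}).
\]

The remaining task is to express the quantities involving r, $\theta$ and $\phi$ in
terms of x, y and z. This can often be done by inspection. In the present
case, the coordinate transformation equations

\[
x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta
\]
give

\[
\sin\theta\cos\phi = \frac{x}{r}, \quad \sin\theta\sin\phi = \frac{y}{r}, \quad \cos\theta = \frac{z}{r}.
\]
Also, it is clear from geometry (and equation (8)) that the distance from
the origin is $r = (x^2 + y^2 + z^2)^{1/2}$. Hence the vector field can be expressed
as
\[
\begin{aligned}
\mathbf{F} &= \frac{1}{r^2}\left(\frac{x}{r}\,\mathbf{i} + \frac{y}{r}\,\mathbf{j} + \frac{z}{r}\,\mathbf{k}\right) \\
           &= \frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{(x^2 + y^2 + z^2)^{3/2}}.
\end{aligned}
\]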
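
A numerical spot check of this reverse conversion is sketched below, assuming the field is $\mathbf{F} = \mathbf{e}_r / r^2$ as in the working above; the function names and sample point are our own.

```python
import math

def field_from_spherical(r, theta, phi):
    """Cartesian components of F = e_r / r^2, with e_r taken from equations (17)."""
    e_r = (math.sin(theta) * math.cos(phi),
           math.sin(theta) * math.sin(phi),
           math.cos(theta))
    return tuple(c / r**2 for c in e_r)

def field_from_cartesian(x, y, z):
    """Cartesian form: F = (x i + y j + z k) / (x^2 + y^2 + z^2)^(3/2)."""
    d3 = (x**2 + y**2 + z**2) ** 1.5
    return (x / d3, y / d3, z / d3)

r, theta, phi = 2.0, 0.6, 1.9
x = r * math.sin(theta) * math.cos(phi)
y = r * math.sin(theta) * math.sin(phi)
z = r * math.cos(theta)
print(field_from_spherical(r, theta, phi))
print(field_from_cartesian(x, y, z))   # should match the line above
```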




Exercise 7 
In spherical coordinates, a vector field takes the form 
\[
\mathbf{F}(r, \theta, \phi) = \cos\theta\,\mathbf{e}_r - \sin\theta\,\mathbf{e}_\theta.
\]


Express this field in Cartesian coordinates. 


1.5 Unit vectors and scale factors 


This subsection develops the theme of scale factors introduced in 
Unit 8. It shows that unit vectors in all coordinate systems can be 
obtained from a single formula (equation (27)). This useful formula 
allows us to avoid elaborate geometric arguments in three dimensions. 
It will be used later on, but its derivation is not assessed. 


To take a unified view, we consider any three-dimensional coordinate 
system with coordinates (u,v,w). The coordinates u, v and w are related 
to Cartesian coordinates by equations of the type 


\[
x = x(u, v, w), \quad y = y(u, v, w), \quad z = z(u, v, w).
\]


In spherical coordinates, for example, $(u, v, w) = (r, \theta, \phi)$, and these
equations take the form

\[
x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta.
\]


We can define coordinate lines along which just one coordinate increases 
while the other two remain fixed. Along a u-coordinate line, for example, u 
increases while v and w have fixed values. Then we can imagine taking a 
small step along a u-coordinate line. In general, the chain rule tells us that 


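\[
\delta x = \frac{\partial x}{\partial u}\,\delta u + \frac{\partial x}{\partial v}\,\delta v + \frac{\partial x}{\partial w}\,\delta w.
\]

But along a u-coordinate line, v and w are held constant, so $\delta v = 0$ and
$\delta w = 0$, giving

\[
\delta x = \frac{\partial x}{\partial u}\,\delta u.
\]

Similarly,

\[
\delta y = \frac{\partial y}{\partial u}\,\delta u \quad \text{and} \quad \delta z = \frac{\partial z}{\partial u}\,\delta u.
\]

The displacement along the u-coordinate line produced by a small increase
in u is therefore given by the vector

\[
\delta x\,\mathbf{i} + \delta y\,\mathbf{j} + \delta z\,\mathbf{k}
= \left(\frac{\partial x}{\partial u}\,\mathbf{i} + \frac{\partial y}{\partial u}\,\mathbf{j} + \frac{\partial z}{\partial u}\,\mathbf{k}\right)\delta u. \tag{24}
\]

Unit 8, Subsection 4.2 gave a
similar argument.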


Figure 23 Tangent vectors, 
unit vectors and scale factors 
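This vector describes a tiny displacement along the u-coordinate line in the
direction of increasing u. The crucial point is that this is just the direction
of the required unit vector $\mathbf{e}_u$. The factor $\delta u$ in equation (24) simply scales
the vector in brackets without changing its direction. We can therefore
concentrate on the vector in brackets, which is denoted by

\[
\mathbf{T}_u = \frac{\partial x}{\partial u}\,\mathbf{i} + \frac{\partial y}{\partial u}\,\mathbf{j} + \frac{\partial z}{\partial u}\,\mathbf{k}. \tag{25}
\]

Because this vector is tangential to the u-coordinate line, it is called a
tangent vector.


To obtain the unit vector $\mathbf{e}_u$, we just need to scale $\mathbf{T}_u$ by $1/|\mathbf{T}_u|$. So we get

\[
\mathbf{e}_u = \frac{1}{|\mathbf{T}_u|}\,\mathbf{T}_u, \tag{26}
\]

where

\[
|\mathbf{T}_u| = \sqrt{\left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2 + \left(\frac{\partial z}{\partial u}\right)^2}.
\]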


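Referring back to Unit 8, Section 4, you can see that $|\mathbf{T}_u|$ is just the scale
factor $h_u$ for the u-coordinate. We have therefore derived the following
result for any unit vector.


Calculating unit vectors

In any orthogonal coordinate system (u, v, w) with
\[
x = x(u, v, w), \quad y = y(u, v, w), \quad z = z(u, v, w),
\]
the unit vector $\mathbf{e}_u$ in the direction of the u-coordinate line is related
to the Cartesian unit vectors by

\[
\mathbf{e}_u = \frac{1}{h_u}\left(\frac{\partial x}{\partial u}\,\mathbf{i} + \frac{\partial y}{\partial u}\,\mathbf{j} + \frac{\partial z}{\partial u}\,\mathbf{k}\right), \tag{27}
\]
where $h_u$ is the scale factor
\[
h_u = \sqrt{\left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2 + \left(\frac{\partial z}{\partial u}\right)^2}. \tag{28}
\]


Of course, similar equations apply for two-dimensional coordinate systems
in the xy-plane, but without the terms involving z and k.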


The geometric relationship between tangent vectors, unit vectors and scale 
factors is illustrated in Figure 23. 


Example 3 


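Use tangent vectors to derive expressions for the unit vectors $\mathbf{e}_r$ and $\mathbf{e}_\phi$ of
polar coordinates.


Solution


In polar coordinates we have $x = r\cos\phi$ and $y = r\sin\phi$. So the tangent
vectors in the r- and $\phi$-directions are

\[
\begin{aligned}
\mathbf{T}_r    &= \frac{\partial x}{\partial r}\,\mathbf{i} + \frac{\partial y}{\partial r}\,\mathbf{j} = \cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}, \\
\mathbf{T}_\phi &= \frac{\partial x}{\partial \phi}\,\mathbf{i} + \frac{\partial y}{\partial \phi}\,\mathbf{j} = -r\sin\phi\,\mathbf{i} + r\cos\phi\,\mathbf{j}.
\end{aligned}
\]

These vectors have magnitudes

\[
\begin{aligned}
h_r    &= \sqrt{\cos^2\phi + \sin^2\phi} = 1, \\
h_\phi &= \sqrt{(-r\sin\phi)^2 + (r\cos\phi)^2} = r,
\end{aligned}
\]

which are the familiar scale factors of polar coordinates. So the unit
vectors are

\[
\begin{aligned}
\mathbf{e}_r    &= \frac{1}{h_r}\,\mathbf{T}_r = \cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}, \\
\mathbf{e}_\phi &= \frac{1}{h_\phi}\,\mathbf{T}_\phi = -\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j},
\end{aligned}
\]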


in agreement with equations (11). 
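
The tangent-vector recipe also lends itself to a numerical approximation, which can be handy for checking hand calculations. The sketch below estimates the partial derivatives by central differences rather than exact differentiation (a substitution of ours, not the method of the text); the function names and step size are our own choices.

```python
import math

def polar_position(r, phi):
    """Cartesian position (x, y) of the point with polar coordinates (r, phi)."""
    return (r * math.cos(phi), r * math.sin(phi))

def tangent_vector(position, coords, index, step=1e-6):
    """Central-difference estimate of the tangent vector obtained by varying
    coords[index] while the other coordinates are held fixed."""
    up, down = list(coords), list(coords)
    up[index] += step
    down[index] -= step
    return tuple((a - b) / (2.0 * step)
                 for a, b in zip(position(*up), position(*down)))

def unit_vector_and_scale_factor(position, coords, index):
    """Scale the tangent vector by its magnitude (the scale factor)."""
    t = tangent_vector(position, coords, index)
    h = math.sqrt(sum(c * c for c in t))
    return tuple(c / h for c in t), h

r, phi = 2.0, 0.6
e_r, h_r = unit_vector_and_scale_factor(polar_position, (r, phi), 0)
e_phi, h_phi = unit_vector_and_scale_factor(polar_position, (r, phi), 1)
print(h_r, h_phi)   # approximately 1 and r
print(e_r, (math.cos(phi), math.sin(phi)))
print(e_phi, (-math.sin(phi), math.cos(phi)))
```

Because `tangent_vector` works with any position function, the same helpers could be pointed at a three-coordinate transformation as well.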


Exercise 8 


Use tangent vectors to derive expressions for the unit vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$ and $\mathbf{e}_\phi$
of spherical coordinates. 


2 The gradient of a scalar field 


We can now discuss the main topic of this unit — describing how rapidly 
fields change in space. This section describes the spatial rates of change of 
scalar fields, while Sections 3 and 4 describe spatial rates of change of 
vector fields. 


The spatial rates of change of a scalar field V(x, y, z) can be described
using three partial derivatives: $\partial V/\partial x$, $\partial V/\partial y$ and $\partial V/\partial z$. An important
quantity can be constructed from these: the gradient vector, which you
met in Unit 7. Subsection 2.1 gives a brief review of the gradient vector in
Cartesian coordinates, covering much the same ground as in Unit 7. In 
Subsection 2.2 we go further and explain how the gradient vector is 
calculated in non-Cartesian coordinates. 
The symbol $\nabla$ is read as ‘del’ or
‘nabla’.


Have another look at Unit 7 if 
you need to refresh your 
memory. 
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2.1 Gradients in Cartesian coordinates 


Suppose that V(x, y, z) is a three-dimensional scalar field expressed in 
Cartesian coordinates. Then you know how to find the gradient of this 
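field. According to Unit 7,
\[
\operatorname{grad} V = \frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j} + \frac{\partial V}{\partial z}\,\mathbf{k}. \tag{29}
\]
This is often written in a different notation, using the symbol $\nabla$ instead of
grad. In this notation, the gradient of V(x, y, z) is
\[
\nabla V = \frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j} + \frac{\partial V}{\partial z}\,\mathbf{k}. \tag{30}
\]
Because $\nabla V$ is a vector, the symbol $\nabla$ is printed in bold type, and should
be underlined in handwriting. We will use the notations grad V and $\nabla V$
interchangeably.


For a two-dimensional scalar field V(x, y), the gradient is

\[
\operatorname{grad} V = \nabla V = \frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j}.
\]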

Example 4 
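Calculate the gradient of $V(x, y, z) = xy^2z^3$.
Solution
Using equation (29), the gradient is
\[
\begin{aligned}
\operatorname{grad} V &= \frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j} + \frac{\partial V}{\partial z}\,\mathbf{k} \\
                      &= y^2z^3\,\mathbf{i} + 2xyz^3\,\mathbf{j} + 3xy^2z^2\,\mathbf{k}.
\end{aligned}
\]
In the alternative $\nabla$ notation, this is written as

\[
\nabla V = y^2z^3\,\mathbf{i} + 2xyz^3\,\mathbf{j} + 3xy^2z^2\,\mathbf{k}.
\]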
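
A gradient found by hand can be checked against a finite-difference estimate. The sketch below does this for the field of Example 4 at an arbitrarily chosen point; the central-difference helper and its step size are our own choices and give only an approximation.

```python
def v(x, y, z):
    """Scalar field of Example 4: V = x y^2 z^3."""
    return x * y**2 * z**3

def grad_v_exact(x, y, z):
    """Gradient found in Example 4."""
    return (y**2 * z**3, 2.0 * x * y * z**3, 3.0 * x * y**2 * z**2)

def grad_numeric(f, point, step=1e-5):
    """Central-difference estimate of the three partial derivatives of f."""
    estimates = []
    for i in range(3):
        up, down = list(point), list(point)
        up[i] += step
        down[i] -= step
        estimates.append((f(*up) - f(*down)) / (2.0 * step))
    return tuple(estimates)

point = (1.0, 2.0, -1.5)
print(grad_v_exact(*point))      # (-13.5, -13.5, 27.0)
print(grad_numeric(v, point))    # close to the exact values
```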


Exercise 9 

Find the gradients of the following functions. 
(a) V(a,y) = er tv" 

(b) V(a,y,2) = er tt 


Properties of gradient 


The properties of the gradient vector were outlined in Unit 7, mainly for 
functions of two variables. Given a function V(x, y), the gradient vector
grad V at any given point P is a vector in the xy-plane. This vector
satisfies the following properties.

• Its direction is that in which V increases most rapidly. This direction
  is perpendicular to the contour lines of V(x, y).

• Its magnitude is the maximum rate of increase of V with respect to
  distance travelled in the xy-plane.
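
These two properties can be illustrated numerically. The sketch below uses a hypothetical sample field, $V(x, y) = x^2 + 3y$, and the standard result that the rate of change of V per unit distance in the direction of a unit vector n is the scalar product of n with grad V. It scans directions around a point and compares the rates with the magnitude of the gradient.

```python
import math

def grad_v(x, y):
    """Gradient of the hypothetical sample field V(x, y) = x^2 + 3y."""
    return (2.0 * x, 3.0)

def directional_rate(x, y, angle):
    """Rate of change of V per unit distance in direction (cos angle, sin angle)."""
    g_x, g_y = grad_v(x, y)
    return g_x * math.cos(angle) + g_y * math.sin(angle)

x0, y0 = 1.0, 2.0
g_x, g_y = grad_v(x0, y0)
magnitude = math.hypot(g_x, g_y)
steepest = math.atan2(g_y, g_x)            # direction of grad V

rates = [directional_rate(x0, y0, math.radians(k)) for k in range(360)]
print(max(rates), magnitude)                              # maximum rate never exceeds |grad V|
print(directional_rate(x0, y0, steepest))                 # equals |grad V|
print(directional_rate(x0, y0, steepest + math.pi / 2))   # zero along the contour line
```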


Exercise 10 


The figure in the margin shows a contour map of a two-dimensional scalar 
field V. Use arrows to indicate the directions and relative magnitudes of 
grad V at points A, B and C. You should use the convention that an 
arrow represents the value of grad V at its own tail. The absolute lengths 
of the arrows are arbitrary, but you should choose their relative lengths 
appropriately. 


Exercise 11 
A two-dimensional scalar field is given by 
\[
V(x, y) = \ln\!\left(\sqrt{x^2 + y^2}\right) \quad ((x, y) \neq (0, 0)).
\]


Find the gradient of this field, and draw a sketch showing the contour line
of V through (1, 1) and an arrow representing $\nabla V$ at this point.


Most of the scalar fields met in science vary in three-dimensional space. 
The properties of gradient in three dimensions are natural extensions of 
those in two dimensions. 


Properties of gradient in three dimensions 


Given a function $V(x, y, z)$, at any given point grad V is a vector in
three-dimensional space.

• Its direction is that in which V increases most rapidly. This
direction is perpendicular to the contour surfaces of V.

• Its magnitude is the maximum rate of increase of V with respect
to distance travelled in three-dimensional space.


All these properties of gradient, in two and three dimensions, are taken as 
known facts in the context of this unit. For scalar fields, however, we can 
take things a step further. 


Remember that, by definition, the values of a scalar field V do not depend 
on the orientation of the coordinate system. It follows that at any given 
point, the direction and magnitude of steepest increase of V, and hence the 
direction and magnitude of grad V, do not depend on the orientation of 
the coordinate system. This means that grad V is a vector field in the full 
sense of the term, as defined in Subsection 1.1.

We will use the terms gradient, gradient vector and gradient vector field interchangeably.

Following our usual convention, f labels two different functions: $f(x, y, z)$ and $f(X, Y, Z)$.

Gradient of a scalar field


The gradient of a scalar field is a vector field: at each point, grad V 
has a definite magnitude and a definite direction in space that do not 
depend on the orientation of the coordinate system. 


This conclusion is a deep one. Suppose that a Cartesian coordinate system 
has coordinates (x,y, z) and unit vectors i, j and k, and that it is rotated 
to create another Cartesian coordinate system with coordinates (X, Y, Z) 
and unit vectors I, J and K. Then a scalar field 


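$$f(x, y, z) = f(X, Y, Z)$$
has gradient

$$\operatorname{grad} f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k} = \frac{\partial f}{\partial X}\,\mathbf{I} + \frac{\partial f}{\partial Y}\,\mathbf{J} + \frac{\partial f}{\partial Z}\,\mathbf{K}.$$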


The equality of the two expressions on the right is not a trivial fact, but is 
guaranteed because grad f is a vector field and therefore cannot depend 
on the orientation of the coordinate system used to describe it. 


Exercise 12 


A three-dimensional scalar field is given by 


$$V(x, y, z) = \frac{1}{(x^2 + y^2 + z^2)^{3/2}} \quad ((x, y, z) \neq (0, 0, 0)).$$


Calculate the gradient of this field. 


Exercise 13 


The temperature (in degrees Celsius) in a certain region of space is given 
by the scalar field 


$$T(x, y, z) = 1000 \exp\left(-(x^2 + 2y^2 + 2z^2)\right),$$
where x, y and z are measured in metres.
(a) Calculate the gradient of this scalar field at the point (1, 1, 1).

(b) Specify a unit vector $\mathbf{n}$ that gives the direction of the most rapid
increase in temperature on moving away from (1, 1, 1).


Gradients and small changes 


The components of $\boldsymbol{\nabla} V$ are $\partial V/\partial x$, $\partial V/\partial y$ and $\partial V/\partial z$, which are the rates
of change of V in the x-, y- and z-directions. However, if we know $\boldsymbol{\nabla} V$, we
can also deduce how V changes in any other direction.


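Suppose that we make a tiny displacement from a point $(x, y, z)$ to a
neighbouring point $(x + \delta x, y + \delta y, z + \delta z)$. As a result of this
displacement, a scalar field $V(x, y, z)$ changes by

$$\delta V = V(x + \delta x, y + \delta y, z + \delta z) - V(x, y, z),$$

and the chain rule tells us that

$$\delta V \simeq \frac{\partial V}{\partial x}\,\delta x + \frac{\partial V}{\partial y}\,\delta y + \frac{\partial V}{\partial z}\,\delta z.$$

Now, the right-hand side of this equation can be rewritten as a scalar
product:

$$\delta V \simeq \left(\frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j} + \frac{\partial V}{\partial z}\,\mathbf{k}\right) \cdot \left(\delta x\,\mathbf{i} + \delta y\,\mathbf{j} + \delta z\,\mathbf{k}\right).$$

Here

$$\frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j} + \frac{\partial V}{\partial z}\,\mathbf{k} = \boldsymbol{\nabla} V$$

is the gradient of V, and

$$\delta x\,\mathbf{i} + \delta y\,\mathbf{j} + \delta z\,\mathbf{k} = \delta\mathbf{s}$$

is the displacement vector, so we reach the following conclusion.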


The small change in a scalar field V due to a small displacement $\delta\mathbf{s}$ is

$$\delta V \simeq \boldsymbol{\nabla} V \cdot \delta\mathbf{s}. \qquad (32)$$

If we know the gradient of a scalar field at a given point, we can use
equation (32) to estimate the change in V that occurs when we make a
small displacement $\delta\mathbf{s}$ away from that point.

If the tiny displacement vector $\delta\mathbf{s}$ covers a distance $\delta s$ in the direction of
the unit vector $\mathbf{n}$, we can write $\delta\mathbf{s} = \mathbf{n}\,\delta s$, so

$$\delta V \simeq (\boldsymbol{\nabla} V \cdot \mathbf{n})\,\delta s.$$

Then dividing both sides by $\delta s$ and taking the limit as the distance $\delta s$
tends to zero, we see that

$$\frac{dV}{ds} = \boldsymbol{\nabla} V \cdot \mathbf{n}. \qquad (33)$$

The rate of change of V with distance in the direction of the unit
vector $\mathbf{n}$ is the component of the gradient vector $\boldsymbol{\nabla} V$ in the direction
of $\mathbf{n}$.
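Equations (32) and (33) can be tried out numerically. The sketch below is an illustration only, under stated assumptions: it reuses the field $V = xy^2z^3$ of Example 4, and the point, the displacement and the unit vector are arbitrary choices that do not come from the text. It compares the estimate $\boldsymbol{\nabla} V \cdot \delta\mathbf{s}$ with the actual change in V, and evaluates the rate of change $\boldsymbol{\nabla} V \cdot \mathbf{n}$.

```python
def V(x, y, z):
    # Scalar field of Example 4
    return x * y**2 * z**3


def grad_V(x, y, z):
    # Gradient of V, as found in Example 4
    return (y**2 * z**3, 2 * x * y * z**3, 3 * x * y**2 * z**2)


def dot(a, b):
    """Scalar product of two vectors given as tuples of components."""
    return sum(ai * bi for ai, bi in zip(a, b))


p = (1.0, 2.0, 1.0)        # starting point (an arbitrary choice)
ds = (0.01, -0.02, 0.01)   # small displacement vector (an arbitrary choice)

estimate = dot(grad_V(*p), ds)                 # equation (32)
q = tuple(pi + di for pi, di in zip(p, ds))    # neighbouring point
actual = V(*q) - V(*p)                         # true change in V

n = (2 / 3, 1 / 3, 2 / 3)                      # a unit vector: 4/9 + 1/9 + 4/9 = 1
rate = dot(grad_V(*p), n)                      # equation (33)

print(estimate, actual, rate)
```

The estimate and the actual change differ only at second order in the displacement, which is why the approximation improves as the displacement shrinks.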


Exercise 14 
A scalar field takes the form $V(x, y, z) = (x^2 + y^2 + z^2)^{3/2}$.


(a) Estimate the change in V between the points (1, 2,2) and 
(0.98, 1.99, 2.01). 


(b) Find the rate of change of V at the point (1, 2,2) in the direction of 
the unit vector n = (3i + 4k)/5. 


In the limit where $\delta s$ tends to zero, our approximations become exact.

Figure 24 Heat flows in the opposite direction to the gradient of temperature

Figure 25 The gravitational force acts downwards, in the opposite direction to the gradient of potential energy

Gradients of scalar fields in the real world


Gradients often appear in mathematical descriptions of physical 
systems. You know that heat flows from the hot ring of an electric 
stove to a colder saucepan, warming its contents. In general, heat 
flows from regions of high temperature to regions of low temperature, 
a process that tends to reduce spatial variations in temperature. 


For any temperature field $T(x, y, z)$, the temperature gradient grad T
plays a vital role in determining how heat is conducted.

As indicated in Figure 24, heat flows in the direction in which
temperature decreases most rapidly, which is the direction of
−grad T. Moreover, the rate of flow of heat at each point is
proportional to the magnitude of grad T. We can therefore say that

$$\text{heat flow} \propto -\operatorname{grad} T.$$

Similar results apply when pressure P or molecular concentration C
varies in space. Again, flows arise that tend to reduce these spatial
variations, and which are proportional to −grad P and −grad C.


Gradients are also important in mechanics. Physicists use a concept 
called potential energy. For example, close to the Earth’s surface, the 
gravitational potential energy of an object of mass m is V = mgz, 
where g is a constant and z is the object’s height above the ground. 
Whenever we are given a potential energy field $V(x, y, z)$, we can
deduce the corresponding force by taking minus its gradient. In the
case of terrestrial gravity,

$$\mathbf{F} = -\operatorname{grad}(mgz) = -mg\,\mathbf{k},$$


which is a constant force that acts vertically downwards (Figure 25). 


2.2 Gradients in non-Cartesian coordinates 


In situations where there is axial or spherical symmetry it is usually best 
to describe a scalar field in polar, cylindrical or spherical coordinates. In 
this subsection we explain how to derive the corresponding gradient vector 
fields, also in polar, cylindrical or spherical coordinates. 


We therefore consider a general three-dimensional coordinate system with
coordinates $(u, v, w)$ and corresponding unit vectors $\mathbf{e}_u$, $\mathbf{e}_v$ and $\mathbf{e}_w$. We
assume that the coordinate system is orthogonal, which means that these
three unit vectors are mutually orthogonal. If you want to keep a specific
example in mind, you can imagine that $(u, v, w)$ stand for $(r, \theta, \phi)$ of
spherical coordinates, but the strength of our argument is that $(u, v, w)$
can represent any orthogonal coordinate system.


If we are given a scalar field V, this can be expressed either in Cartesian
coordinates or in the $(u, v, w)$ coordinate system. If $(x, y, z)$ and $(u, v, w)$
label the same point, we can write

$$V(x, y, z) = V(u, v, w).$$
The gradient of V is a vector field, which can be expressed in Cartesian
coordinates as

$$\boldsymbol{\nabla} V = \frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j} + \frac{\partial V}{\partial z}\,\mathbf{k}. \qquad (34)$$

The same vector field can also be expressed in the $(u, v, w)$ coordinate
system. We do not know the components of $\boldsymbol{\nabla} V$ in this system, but we
can write

$$\boldsymbol{\nabla} V = g_u\,\mathbf{e}_u + g_v\,\mathbf{e}_v + g_w\,\mathbf{e}_w,$$
where the components $g_u$, $g_v$ and $g_w$ remain to be determined.


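Now, the crucial step is to recall that the gradient of a scalar field is a
vector field, that is, a field whose vector values at each point are independent of
the orientation of the coordinate system. This means that we can use
Procedure 1 to write, for example,

$$g_u = \mathbf{e}_u \cdot \boldsymbol{\nabla} V.$$

We then evaluate this scalar product using Cartesian expressions for $\mathbf{e}_u$
and $\boldsymbol{\nabla} V$. In Subsection 1.5 you saw that there is a general expression
for $\mathbf{e}_u$. According to equation (27),

$$\mathbf{e}_u = \frac{1}{h_u}\left(\frac{\partial x}{\partial u}\,\mathbf{i} + \frac{\partial y}{\partial u}\,\mathbf{j} + \frac{\partial z}{\partial u}\,\mathbf{k}\right),$$

where $h_u$ is the scale factor for the u-coordinate. Combining this with the
Cartesian expression for $\boldsymbol{\nabla} V$ given in equation (34), we get

$$g_u = \frac{1}{h_u}\left(\frac{\partial x}{\partial u}\,\mathbf{i} + \frac{\partial y}{\partial u}\,\mathbf{j} + \frac{\partial z}{\partial u}\,\mathbf{k}\right) \cdot \left(\frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j} + \frac{\partial V}{\partial z}\,\mathbf{k}\right)
= \frac{1}{h_u}\left(\frac{\partial V}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial V}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial V}{\partial z}\frac{\partial z}{\partial u}\right).$$
Finally, the term in brackets can now be simplified using a version of the
chain rule (given in Unit 7, equation (24)). We conclude that

$$g_u = \frac{1}{h_u}\frac{\partial V}{\partial u},$$

and there are similar results for the other two components, $g_v$ and $g_w$. We
therefore reach the following very useful conclusion.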


Gradient in a general orthogonal coordinate system 


Given an orthogonal coordinate system $(u, v, w)$ with unit vectors $\mathbf{e}_u$,
$\mathbf{e}_v$ and $\mathbf{e}_w$, and scale factors $h_u$, $h_v$ and $h_w$, the gradient of a scalar
field $V(u, v, w)$ is

$$\boldsymbol{\nabla} V = \frac{1}{h_u}\frac{\partial V}{\partial u}\,\mathbf{e}_u + \frac{1}{h_v}\frac{\partial V}{\partial v}\,\mathbf{e}_v + \frac{1}{h_w}\frac{\partial V}{\partial w}\,\mathbf{e}_w. \qquad (35)$$

For a scalar field $V(u, v)$ in two dimensions, the same formula applies, but with the last term (involving $\mathbf{e}_w$) omitted.

The domains of V and $\boldsymbol{\nabla} V$ exclude the origin r = 0.

This remarkable result applies in all orthogonal coordinate systems. This
includes the Cartesian, polar, cylindrical and spherical coordinates studied 
in this module, as well as other orthogonal coordinate systems that are in 
occasional use. 


In two- and three-dimensional Cartesian coordinate systems, the scale 
factors are all equal to 1, and equation (35) gives 


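$$\boldsymbol{\nabla} V = \frac{\partial V}{\partial x}\,\mathbf{e}_x + \frac{\partial V}{\partial y}\,\mathbf{e}_y \quad \text{in two dimensions},$$
$$\boldsymbol{\nabla} V = \frac{\partial V}{\partial x}\,\mathbf{e}_x + \frac{\partial V}{\partial y}\,\mathbf{e}_y + \frac{\partial V}{\partial z}\,\mathbf{e}_z \quad \text{in three dimensions}.$$

Here, of course, $\mathbf{e}_x$, $\mathbf{e}_y$ and $\mathbf{e}_z$ are Cartesian unit vectors in the x-, y- and
z-directions, more usually written as i, j and k. So we recover the
definition of gradient in Cartesian coordinates in equations (30) and (31).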


In polar, cylindrical or spherical coordinates, the key to using 

equation (35) is to know the appropriate scale factors. These were given in 
Unit 8 in the context of finding area or volume elements for multiple 
integrals. For ease of reference, the results are summarised below. 


Table 1 Scale factors of common coordinate systems

Coordinate system                            Scale factors
Polar coordinates $(r, \phi)$                $h_r = 1$, $h_\phi = r$
Cylindrical coordinates $(r, \phi, z)$       $h_r = 1$, $h_\phi = r$, $h_z = 1$
Spherical coordinates $(r, \theta, \phi)$    $h_r = 1$, $h_\theta = r$, $h_\phi = r\sin\theta$
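The spherical entries of Table 1 can be checked numerically. Since the unit vector in equation (27) has unit length, the scale factor $h_u$ is the magnitude of the vector with components $\partial x/\partial u$, $\partial y/\partial u$, $\partial z/\partial u$. The sketch below is an illustration, not part of the original unit. It assumes the usual convention $x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\theta$ (with $\theta$ measured from the z-axis), and the function names and test point are hypothetical.

```python
import math


def spherical_to_cartesian(r, theta, phi):
    # Assumed convention: theta is measured from the z-axis, phi around it.
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def scale_factors(to_cartesian, coords, step=1e-6):
    """Estimate each scale factor as the magnitude of the derivative of the
    Cartesian position with respect to one coordinate (central differences)."""
    factors = []
    for i in range(len(coords)):
        plus = list(coords)
        minus = list(coords)
        plus[i] += step
        minus[i] -= step
        derivative = [(a - b) / (2 * step)
                      for a, b in zip(to_cartesian(*plus), to_cartesian(*minus))]
        factors.append(math.sqrt(sum(c * c for c in derivative)))
    return factors


r, theta, phi = 2.0, 0.7, 1.2   # arbitrary test point (an assumption)
h_r, h_theta, h_phi = scale_factors(spherical_to_cartesian, (r, theta, phi))
print(h_r, h_theta, h_phi)      # compare with 1, r and r*sin(theta)
```

Passing a different coordinate transformation to `scale_factors` gives the corresponding check for polar or cylindrical coordinates.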


Using Table 1 and equation (35), we can immediately write down the
expression for gradient in polar coordinates. With $h_r = 1$ and $h_\phi = r$, we
get

$$\boldsymbol{\nabla} V = \frac{\partial V}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial V}{\partial \phi}\,\mathbf{e}_\phi. \qquad (36)$$
You can use this formula to find the gradient of any scalar field given in
polar coordinates.


Example 5 


Exercise 11 considered the scalar field $V(x, y) = \ln(\sqrt{x^2 + y^2})$ in Cartesian
coordinates. In polar coordinates, this field takes the form $V(r, \phi) = \ln(r)$.
Find the gradient of $V(r, \phi)$ in polar coordinates.

Solution

The partial derivatives of $V(r, \phi) = \ln(r)$ are
$$\frac{\partial V}{\partial r} = \frac{1}{r}, \qquad \frac{\partial V}{\partial \phi} = 0,$$

so for this scalar field,

$$\boldsymbol{\nabla} V = \frac{1}{r}\,\mathbf{e}_r + \frac{1}{r} \times 0\;\mathbf{e}_\phi = \frac{1}{r}\,\mathbf{e}_r \quad (r \neq 0).$$

This gradient field has magnitude 1/r and points radially outwards, away
from the origin. This answer agrees with that of Exercise 11, but has been
obtained more efficiently.
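The agreement between the polar and Cartesian descriptions can be illustrated numerically. The sketch below is not part of the original unit and rests on stated assumptions: it uses the standard relations $\mathbf{e}_r = \cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}$ and $\mathbf{e}_\phi = -\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}$, and the function names and test point are hypothetical. It converts the polar result of Example 5 to Cartesian components and compares it with a finite-difference gradient of the Cartesian form of the field.

```python
import math


def V_cart(x, y):
    # Cartesian form of the field in Example 5
    return math.log(math.hypot(x, y))


def grad_cart(f, x, y, h=1e-6):
    """Central-difference estimate of the Cartesian components of grad f."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))


def grad_polar_as_cartesian(r, phi):
    """Polar result of Example 5, (1/r) e_r + 0 e_phi, in Cartesian components."""
    g_r, g_phi = 1.0 / r, 0.0
    e_r = (math.cos(phi), math.sin(phi))
    e_phi = (-math.sin(phi), math.cos(phi))
    return (g_r * e_r[0] + g_phi * e_phi[0],
            g_r * e_r[1] + g_phi * e_phi[1])


x, y = 3.0, 4.0                                 # arbitrary test point, r = 5
r, phi = math.hypot(x, y), math.atan2(y, x)
numeric = grad_cart(V_cart, x, y)
from_polar = grad_polar_as_cartesian(r, phi)
print(numeric, from_polar)
```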


Using Table 1 and equation (35), we can also find expressions for gradient 
in cylindrical coordinates and spherical coordinates. 


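In cylindrical coordinates, the scale factors are $h_r = 1$, $h_\phi = r$ and $h_z = 1$,
so

$$\boldsymbol{\nabla} V = \frac{\partial V}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial V}{\partial \phi}\,\mathbf{e}_\phi + \frac{\partial V}{\partial z}\,\mathbf{e}_z. \qquad (37)$$
In spherical coordinates, the scale factors are $h_r = 1$, $h_\theta = r$ and
$h_\phi = r\sin\theta$, so
$$\boldsymbol{\nabla} V = \frac{\partial V}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial V}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{r\sin\theta}\frac{\partial V}{\partial \phi}\,\mathbf{e}_\phi. \qquad (38)$$
You should not bother to memorise these results, as it is easier to recall
the general shape of equation (35) and the relevant scale factors.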


Exercise 15 
Exercise 12 considered the scalar field $V(x, y, z) = 1/(x^2 + y^2 + z^2)^{3/2}$ in
Cartesian coordinates. In spherical coordinates, this field takes the form
$$V(r, \theta, \phi) = \frac{1}{r^3}.$$
The domains of V and $\boldsymbol{\nabla} V$ exclude the origin r = 0.

Calculate the gradient of $V(r, \theta, \phi)$ in spherical coordinates.


Exercise 16 
In cylindrical coordinates, a scalar field takes the form 
$$f(r, \phi, z) = r^2 \sin(2\phi) + z^2.$$

Calculate the gradient $\boldsymbol{\nabla} f$, and hence find the magnitude of the gradient
at any point.


Exercise 17 

In spherical coordinates, a scalar field takes the form 
T(r0,0) =remg. 

(a) Find the corresponding gradient vector field. 


(b) Find the rate of change of T with distance in the direction of the unit
vector $\mathbf{n} = (\mathbf{e}_\theta + \mathbf{e}_\phi)/\sqrt{2}$.




Unit 9 Differentiating scalar and vector fields 




2.3 The del operator


The process that leads from a scalar field to a gradient field can be 
described using a concept called the del operator. For the moment, this 
just repackages what you already know, but you will soon see that the del 
operator is useful in other contexts. 


Essentially, an operator is something that ‘acts on things to produce other 
things’. For example, a particular rotation operator may represent a 
rotation by 30° about the z-axis; when this operator acts on any position 
vector, it produces another position vector. 


We are interested in operators that act on functions to produce other 
functions. An example is the differentiation operator d/dx. This acts on 
any differentiable function f(x) to produce another function, f’(x). 


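In Cartesian coordinates, the del operator is defined by

$$\boldsymbol{\nabla} = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}. \qquad (39)$$
When this operator acts on a scalar field $V(x, y, z)$ expressed in Cartesian
coordinates, it produces the gradient vector field

$$\boldsymbol{\nabla} V = \frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j} + \frac{\partial V}{\partial z}\,\mathbf{k},$$

in agreement with equation (30).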


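When the del operator acts on a given scalar field, it produces a definite
gradient vector field. However, the scalar field and the resulting gradient
vector field can be described in various coordinate systems. The form
chosen to represent the del operator also depends on the coordinate system
used. In an orthogonal coordinate system $(u, v, w)$ with unit vectors
$\mathbf{e}_u$, $\mathbf{e}_v$, $\mathbf{e}_w$ and scale factors $h_u$, $h_v$, $h_w$, the del operator is given by
$$\boldsymbol{\nabla} = \mathbf{e}_u\,\frac{1}{h_u}\frac{\partial}{\partial u} + \mathbf{e}_v\,\frac{1}{h_v}\frac{\partial}{\partial v} + \mathbf{e}_w\,\frac{1}{h_w}\frac{\partial}{\partial w}. \qquad (40)$$
When this operator acts on a scalar field $V(u, v, w)$ expressed in $(u, v, w)$
coordinates, it produces the gradient vector field
$$\boldsymbol{\nabla} V = \mathbf{e}_u\,\frac{1}{h_u}\frac{\partial V}{\partial u} + \mathbf{e}_v\,\frac{1}{h_v}\frac{\partial V}{\partial v} + \mathbf{e}_w\,\frac{1}{h_w}\frac{\partial V}{\partial w},$$

in agreement with equation (35).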


Notice that the unit vectors in equation (40) have been placed to the left of
the partial derivative operators $\partial/\partial u$, $\partial/\partial v$ and $\partial/\partial w$. This is a necessary
precaution because the unit vectors generally depend on the coordinates
$(u, v, w)$. If they were placed to the right of the partial derivative
operators, we would need to differentiate them, and this would not give the
correct gradient vector field. Cartesian coordinates are a special case
because the unit vectors i, j and k are all constant vectors, so they can be
placed either to the left or to the right of $\partial/\partial x$, $\partial/\partial y$ and $\partial/\partial z$.




Because the del operator contains unit vectors, it should be thought of as 
having a vectorial (rather than scalar) character. It is sometimes described 
as being a vector differential operator. That is why it is printed in bold. In 
written work, you should underline it with a wavy or straight line. 


Exercise 18 


Use equation (40) and the scale factors of Table 1 to express the del 
operator in polar coordinates, cylindrical coordinates and spherical 
coordinates. 


3 The divergence of a vector field 


For a scalar field $V(x, y, z)$, the three partial derivatives

$$\frac{\partial V}{\partial x}, \quad \frac{\partial V}{\partial y}, \quad \frac{\partial V}{\partial z}$$
describe its spatial rates of change in the x-, y- and z-directions. However,
you have seen that it is useful to group these three partial derivatives
together to form the gradient field $\boldsymbol{\nabla} V$. Rather than thinking about the
separate partial derivatives, we can think about the gradient field, which 
has a magnitude and direction at each point. This is a powerful idea — you 
have seen that the gradient field allows us to calculate the spatial rate of 
change of V in any direction (not just the coordinate directions). 


The rest of this unit discusses the spatial rates of change of vector fields. 
A three-dimensional vector field $\mathbf{F}(x, y, z)$ has three components, $F_x$, $F_y$
and $F_z$, so there are nine partial derivatives to consider at each point:

$$\frac{\partial F_x}{\partial x},\ \frac{\partial F_x}{\partial y},\ \frac{\partial F_x}{\partial z},\ \frac{\partial F_y}{\partial x},\ \frac{\partial F_y}{\partial y},\ \frac{\partial F_y}{\partial z},\ \frac{\partial F_z}{\partial x},\ \frac{\partial F_z}{\partial y},\ \frac{\partial F_z}{\partial z}.$$

This is a great deal of information, making it hard to visualise what is
going on. Fortunately, the nine partial derivatives can be grouped into two
significant quantities, called the divergence and the curl of the vector
field. In most cases, these two quantities tell us all we need to know about
the spatial rates of change of a vector field.


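The fact that the del operator $\boldsymbol{\nabla}$ has a vectorial character suggests two
ways of grouping the partial derivatives. Given two vectors $\mathbf{a}$ and $\mathbf{b}$, we
can define two different types of product. The scalar product is

$$\mathbf{a} \cdot \mathbf{b} = a_x b_x + a_y b_y + a_z b_z,$$
and the vector product is

$$\mathbf{a} \times \mathbf{b} = (a_y b_z - a_z b_y)\,\mathbf{i} + (a_z b_x - a_x b_z)\,\mathbf{j} + (a_x b_y - a_y b_x)\,\mathbf{k}.$$
For the del operator $\boldsymbol{\nabla}$ acting on a vector field $\mathbf{F}(x, y, z)$, we can introduce
the corresponding combinations

$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}$$

and

$$\boldsymbol{\nabla} \times \mathbf{F} = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\mathbf{i} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\mathbf{j} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\mathbf{k}.$$

It turns out that these are precisely the combinations that we need: $\boldsymbol{\nabla} \cdot \mathbf{F}$
is called the divergence of F, and $\boldsymbol{\nabla} \times \mathbf{F}$ is called the curl of F. This
section discusses divergence, while Section 4 discusses curl.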


3.1 Divergence in Cartesian coordinates 


The above discussion gave a broad overview. Here we begin afresh with 
the basic definition of divergence in Cartesian coordinates. 


Divergence of a vector field in Cartesian coordinates 

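Suppose that a vector field F is expressed in Cartesian coordinates as
$$\mathbf{F} = F_x\,\mathbf{i} + F_y\,\mathbf{j} + F_z\,\mathbf{k}.$$

Then the divergence of F is defined as

$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}. \qquad (41)$$

The alternative notation div F is sometimes used for divergence, so we
can also write

$$\operatorname{div} \mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}. \qquad (42)$$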


Two questions naturally arise: how is divergence calculated, and what does 
divergence tell us? We begin with the calculations, and then discuss the 
interpretation. 


Calculating divergence 


It is easy to calculate the divergence of a vector field F in Cartesian
coordinates. All you need to do is to identify the Cartesian components
$F_x$, $F_y$ and $F_z$ of the field, find their partial derivatives with respect to the
corresponding coordinates x, y and z, and add the results together.


We begin with two-dimensional vector fields because they are simpler to 
visualise. 


Example 6 
Find the divergence of each of the following vector fields: 


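$$\mathbf{A} = 3\,\mathbf{i} + 2\,\mathbf{j}, \quad \mathbf{B} = y\,\mathbf{i} - x\,\mathbf{j}, \quad \mathbf{C} = x\,\mathbf{i} + y\,\mathbf{j}, \quad \mathbf{D} = x^2\,\mathbf{i} - y^2\,\mathbf{j}.$$

Solution

Using the definition of divergence (equation (41)), we get

$$\boldsymbol{\nabla} \cdot \mathbf{A} = \frac{\partial (3)}{\partial x} + \frac{\partial (2)}{\partial y} + \frac{\partial (0)}{\partial z} = 0,$$

$$\boldsymbol{\nabla} \cdot \mathbf{B} = \frac{\partial (y)}{\partial x} + \frac{\partial (-x)}{\partial y} + \frac{\partial (0)}{\partial z} = 0,$$

$$\boldsymbol{\nabla} \cdot \mathbf{C} = \frac{\partial (x)}{\partial x} + \frac{\partial (y)}{\partial y} + \frac{\partial (0)}{\partial z} = 2,$$

$$\boldsymbol{\nabla} \cdot \mathbf{D} = \frac{\partial (x^2)}{\partial x} + \frac{\partial (-y^2)}{\partial y} + \frac{\partial (0)}{\partial z} = 2(x - y).$$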


This example illustrates some cases that can arise. For vector fields A 
and B, the divergence is equal to zero everywhere. The vector field C has 
a constant non-zero divergence, and D has a divergence that varies with 
position, which is the most usual situation. The divergence is always a 
scalar function of position (which may be a constant or zero). 
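The four divergences of Example 6 can be reproduced numerically. The following plain-Python sketch is an illustration only: the helper name `divergence`, the step size and the test point are assumptions, and the partial derivatives are estimated by central differences rather than found exactly.

```python
def divergence(F, x, y, h=1e-6):
    """Central-difference estimate of dFx/dx + dFy/dy for a 2D field F(x, y)."""
    dFx_dx = (F(x + h, y)[0] - F(x - h, y)[0]) / (2 * h)
    dFy_dy = (F(x, y + h)[1] - F(x, y - h)[1]) / (2 * h)
    return dFx_dx + dFy_dy


# The vector fields of Example 6, as (x-component, y-component)
fields = {
    "A": lambda x, y: (3.0, 2.0),
    "B": lambda x, y: (y, -x),
    "C": lambda x, y: (x, y),
    "D": lambda x, y: (x**2, -y**2),
}

x0, y0 = 1.5, -0.5   # arbitrary test point (an assumption)
results = {name: divergence(F, x0, y0) for name, F in fields.items()}
print(results)       # compare with 0, 0, 2 and 2(x0 - y0)
```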


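It is important to understand the distinction between gradient and
divergence. Given a scalar field V, we can construct its gradient $\boldsymbol{\nabla} V$,
which has vector values:

$$\boldsymbol{\nabla} V = \frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j} + \frac{\partial V}{\partial z}\,\mathbf{k}.$$
Given a vector field F, we can construct its divergence $\boldsymbol{\nabla} \cdot \mathbf{F}$, which has
scalar values:

$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}.$$

The gradient is a vector, so its expression involves unit vectors. By
contrast, divergence is a scalar, and its expression is just a sum of
derivatives with no unit vectors. When you specify $\boldsymbol{\nabla} \cdot \mathbf{F}$ in handwriting,
you must underline both $\boldsymbol{\nabla}$ and F and include a dot between them;
otherwise your reader may think that you are referring to the gradient of a
scalar field F.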


Exercise 19 

Calculate the divergence of each of the following vector fields. 
(a) F(a, y,z) = x*yit y?zj —y2?k 

(b) G(a,y, 2) = (w+ y)?it (yt 2)? f+ (e+z)*k 


Exercise 20 

A scalar field V takes the form $V = x^4 + y^4 + z^4$.

(a) Would it make sense to take the divergence of this field?

(b) Find the gradient field $\boldsymbol{\nabla} V$, and calculate its divergence, $\boldsymbol{\nabla} \cdot (\boldsymbol{\nabla} V)$.

In fact, J is the fluid density times the fluid velocity at each point. In this context, J is not the Jacobian factor of Unit 8.

The cube should be tiny: strictly speaking, we are interested in the limiting case where its size tends to zero.


Interpreting divergence 


At any given point (x,y,z), the divergence of a vector field has a definite 
scalar value. In fact, we can go further. 


It turns out that the divergence of a vector field is a scalar field. 


This is an important fact. Recalling the definition of a scalar field given in
Subsection 1.1, it means that the value of the divergence at any given 
point is independent of the orientation of the coordinate system. This 
suggests that the divergence of a vector field might describe some 
significant property of the vector field. Indeed it does, and the name 
divergence provides a clue. 


Intuitive meaning of divergence 


The divergence of a vector field F describes the extent to which F 
flows outwards or diverges from each point. 


This statement is not precise. Deciding how to quantify the ‘extent of 
outward flow’ and linking this to the definition of divergence in 

equation (41) involves a lengthy discussion, and the details are left for the 
next unit. For the moment, we just consider a few typical examples. The 
aim is to give you an intuitive feeling for divergence so that you can 
interpret the results of your calculations. 


Consider a vector field J(x,y,z) that describes the flow of mass in a fluid. 
The fluid could be water or air, and it may have a fixed or variable density. 
The precise definition of J is not needed in this informal discussion. 


At a point P, the value of $\boldsymbol{\nabla} \cdot \mathbf{J}$ tells us about the net flow of fluid towards
or away from P. If we draw a tiny cube around P, then the divergence of
J at P is:

• positive if more fluid leaves the cube than enters it
• negative if more fluid enters the cube than leaves it
• equal to zero if the outflow exactly matches the inflow.


Similar remarks apply to two-dimensional velocity fields, but with the cube 
replaced by a square. 


These ideas can be illustrated with the vector fields of Example 6. Arrow
maps for the fields $\mathbf{A} = 3\,\mathbf{i} + 2\,\mathbf{j}$ and $\mathbf{B} = y\,\mathbf{i} - x\,\mathbf{j}$ are shown in Figure 26,
with selected points indicated by blue dots and small blue squares drawn 
around them. Example 6 showed that these fields have zero divergence 
everywhere. So according to our interpretation of divergence, there should 
be no net flow into or out of these squares. So far as it is possible to tell, 
the arrow maps confirm this interpretation. 
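The 'small square' interpretation can also be tested numerically for the fields of Example 6. The sketch below is an informal illustration, not the precise argument deferred to the next unit: it adds up the outward flow of a field across the four edges of a small square using a midpoint sum, and divides by the area of the square. The function name, the size of the square and the test point are assumptions.

```python
def net_outflow_per_area(F, x0, y0, side=1e-2, n=100):
    """Net outward flow of F across a small square centred on (x0, y0),
    divided by the area of the square (midpoint sum along each edge)."""
    half = side / 2
    dl = side / n
    flow = 0.0
    for k in range(n):
        t = -half + (k + 0.5) * dl
        flow += F(x0 + half, y0 + t)[0] * dl   # right edge, outward normal +i
        flow -= F(x0 - half, y0 + t)[0] * dl   # left edge, outward normal -i
        flow += F(x0 + t, y0 + half)[1] * dl   # top edge, outward normal +j
        flow -= F(x0 + t, y0 - half)[1] * dl   # bottom edge, outward normal -j
    return flow / side**2


# The vector fields of Example 6
A = lambda x, y: (3.0, 2.0)
B = lambda x, y: (y, -x)
C = lambda x, y: (x, y)
D = lambda x, y: (x**2, -y**2)

x0, y0 = 2.0, 1.0   # arbitrary centre of the square (an assumption)
outflow = {name: net_outflow_per_area(F, x0, y0)
           for name, F in (("A", A), ("B", B), ("C", C), ("D", D))}
print(outflow)      # compare with the divergences 0, 0, 2 and 2(x0 - y0)
```

For A and B the inflow and outflow cancel, while for C and D the net outflow per unit area matches the divergence found in Example 6.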


Figure 26 Arrow maps for the vector fields A and B of Example 6


Arrow maps for the vector fields $\mathbf{C} = x\,\mathbf{i} + y\,\mathbf{j}$ and $\mathbf{D} = x^2\,\mathbf{i} - y^2\,\mathbf{j}$ of
Example 6 are shown in Figure 27. Example 6 showed that C has a
divergence that is equal to 2 everywhere. This positive divergence
corresponds to the fact that there is a net flow out of the blue square in
Figure 27(a). The vector field D has a divergence equal to $2(x - y)$, which
is positive for $x > y$ and negative for $x < y$. Our interpretation of
divergence is supported by Figure 27(b), which shows a net flow out of the
lower square, and a net flow into the upper square.

Figure 27 Arrow maps for the vector fields C and D of Example 6

Exercise 21
(a) Calculate the divergence of the vector field

$$\mathbf{F} = \frac{x}{\sqrt{x^2 + y^2}}\,\mathbf{i} + \frac{y}{\sqrt{x^2 + y^2}}\,\mathbf{j} \quad ((x, y) \neq (0, 0)).$$
(b) The arrow map for F is shown in the margin. Use the square ABCD
to show that this diagram supports our interpretation of divergence.

Figure 28 The electric field around positive and negative charge distributions

Divergences of vector fields in the real world


Not all vector fields describe flows. A gravitational field, for example, 
quantifies the gravitational influence of massive objects. The 
divergence of a gravitational field turns out to be negative at points 
occupied by matter (inside the Earth, for example, or inside the Sun). 
In empty space, the divergence of a gravitational field is equal to zero. 


A similar situation applies to electric fields (Figure 28). The 
divergence of an electric field is positive at points where there is 
positive charge, and negative at points where there is negative charge. 
In empty space, the divergence of an electric field is equal to zero. 


Magnetic fields are unusual: so far as we know, the divergence of any 
magnetic field is equal to zero everywhere. 


3.2 Divergence in non-Cartesian coordinates 


You have seen that some vector fields are best described in non-Cartesian 
coordinate systems. For example, the gravitational field around the Sun 
points radially inwards towards the Sun, and is most naturally described in 
spherical coordinates. We often have to calculate the divergence of such 
vector fields, so we need to know how to calculate divergence in 
non-Cartesian coordinates. Polar, cylindrical and spherical coordinates are 
all important in applications. All of these are orthogonal coordinate 
systems. 


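In an orthogonal system of coordinates $(u, v, w)$ with unit vectors $\mathbf{e}_u$, $\mathbf{e}_v$
and $\mathbf{e}_w$, a vector field F can be expressed as

$$\mathbf{F} = F_u\,\mathbf{e}_u + F_v\,\mathbf{e}_v + F_w\,\mathbf{e}_w,$$

and the del operator is given by

$$\boldsymbol{\nabla} = \mathbf{e}_u\,\frac{1}{h_u}\frac{\partial}{\partial u} + \mathbf{e}_v\,\frac{1}{h_v}\frac{\partial}{\partial v} + \mathbf{e}_w\,\frac{1}{h_w}\frac{\partial}{\partial w}.$$
It follows that the divergence of F in the $(u, v, w)$ coordinate system is
$$\boldsymbol{\nabla} \cdot \mathbf{F} = \left(\mathbf{e}_u\,\frac{1}{h_u}\frac{\partial}{\partial u} + \mathbf{e}_v\,\frac{1}{h_v}\frac{\partial}{\partial v} + \mathbf{e}_w\,\frac{1}{h_w}\frac{\partial}{\partial w}\right) \cdot \left(F_u\,\mathbf{e}_u + F_v\,\mathbf{e}_v + F_w\,\mathbf{e}_w\right).$$
This formula is correct, but is not immediately useful because there are
still partial derivatives and scalar products to evaluate. The situation is
complicated by the fact that unit vectors such as $\mathbf{e}_u$ may depend on the
coordinates $(u, v, w)$. So when the partial derivatives act on the
right-hand bracket, the unit vectors $\mathbf{e}_u$, $\mathbf{e}_v$ and $\mathbf{e}_w$ must be differentiated
as well as the components $F_u$, $F_v$ and $F_w$. The calculations get very messy!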


In practice, scientists and mathematicians do not spend time deriving 
general formulas for divergence in non-Cartesian coordinates. Life is too 
short, so they simply look up the results they need in reference works. 




We take a similar attitude. The Appendix to this unit justifies the 
expressions that we will use, but this is optional material, and will not be 
assessed or examined. We focus here on the more important (and more 
straightforward) task of stating and applying standard formulas. 


Equation (35) gave a general formula for the gradient of a scalar field in 
any orthogonal coordinate system. Rather wonderfully, in spite of the 
complications noted above, there is a corresponding formula for the 
divergence of a vector field in any orthogonal coordinate system. 


Divergence of a vector field in orthogonal coordinates 


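In any orthogonal coordinate system $(u, v, w)$ with scale factors $h_u$, $h_v$
and $h_w$, a vector field F has divergence

$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{1}{J}\left[\frac{\partial}{\partial u}\left(\frac{J F_u}{h_u}\right) + \frac{\partial}{\partial v}\left(\frac{J F_v}{h_v}\right) + \frac{\partial}{\partial w}\left(\frac{J F_w}{h_w}\right)\right], \qquad (43)$$
where
$$J = h_u h_v h_w$$
is the product of the scale factors, called the Jacobian factor.

In a two-dimensional orthogonal coordinate system $(u, v)$, the last term is omitted, and $J = h_u h_v$.

J also appeared in Unit 8 in the context of volume integrals.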


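It is easy to check that this formula gives the correct result in Cartesian
coordinates $(u, v, w) = (x, y, z)$. In this case, all the scale factors are equal
to 1, so J = 1 and equation (43) reduces to

$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z},$$

as expected.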


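In polar coordinates $(u, v) = (r, \phi)$, the scale factors are $h_r = 1$ and $h_\phi = r$,
so $J = r$. In this case, equation (43) gives the following formula.

These scale factors are given in Table 1.

Divergence in polar coordinates
$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{1}{r}\frac{\partial (r F_r)}{\partial r} + \frac{1}{r}\frac{\partial F_\phi}{\partial \phi}. \qquad (44)$$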


When there is rotational symmetry, it is often best to specify 
two-dimensional vector fields in polar coordinates, and to calculate their 
divergences using equation (44). 


Example 7 


Find the divergences of the following two-dimensional vector fields, 
expressed in polar coordinates. 


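(a) $\mathbf{F}(r, \phi) = \mathbf{e}_r \quad (r \neq 0)$
(b) $\mathbf{G}(r, \phi) = r\,\mathbf{e}_r + r\sin\phi\,\mathbf{e}_\phi$

Solution
(a) The polar components of F are $F_r = 1$ and $F_\phi = 0$, so

$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{1}{r}\frac{\partial (r)}{\partial r} = \frac{1}{r} \quad (r \neq 0).$$

This is the same field as that in Exercise 21. The calculation is much
easier in polar coordinates than in Cartesian coordinates!

(b) The polar components of G are $G_r = r$ and $G_\phi = r\sin\phi$, so

$$\boldsymbol{\nabla} \cdot \mathbf{G} = \frac{1}{r}\frac{\partial (r^2)}{\partial r} + \frac{1}{r}\frac{\partial (r\sin\phi)}{\partial \phi} = 2 + \cos\phi.$$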
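The answer to part (b) can be cross-checked against the Cartesian definition of divergence. The sketch below is an illustration under stated assumptions, not part of the original unit: it uses the standard relations $\mathbf{e}_r = \cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}$ and $\mathbf{e}_\phi = -\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}$ to express G in Cartesian components, estimates the Cartesian divergence by central differences, and compares it with $2 + \cos\phi$. The function names and the test point are hypothetical.

```python
import math


def G_cart(x, y):
    """Field G = r e_r + r sin(phi) e_phi of Example 7(b), in Cartesian components."""
    r = math.hypot(x, y)
    phi = math.atan2(y, x)
    G_r, G_phi = r, r * math.sin(phi)
    return (G_r * math.cos(phi) - G_phi * math.sin(phi),
            G_r * math.sin(phi) + G_phi * math.cos(phi))


def divergence_cart(F, x, y, h=1e-6):
    """Central-difference estimate of dFx/dx + dFy/dy."""
    dFx_dx = (F(x + h, y)[0] - F(x - h, y)[0]) / (2 * h)
    dFy_dy = (F(x, y + h)[1] - F(x, y - h)[1]) / (2 * h)
    return dFx_dx + dFy_dy


r0, phi0 = 2.0, 0.9                          # arbitrary test point (an assumption)
x0, y0 = r0 * math.cos(phi0), r0 * math.sin(phi0)
numeric = divergence_cart(G_cart, x0, y0)    # Cartesian estimate
formula = 2 + math.cos(phi0)                 # result of Example 7(b)
print(numeric, formula)
```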


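Similar methods apply in three dimensions. In cylindrical coordinates
$(u, v, w) = (r, \phi, z)$, the scale factors are $h_r = 1$, $h_\phi = r$ and $h_z = 1$, so
$J = r$. In this case, equation (43) gives

$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{1}{r}\left(\frac{\partial (r F_r)}{\partial r} + \frac{\partial F_\phi}{\partial \phi} + \frac{\partial (r F_z)}{\partial z}\right).$$

Since r is treated as a constant when partially differentiating with respect
to z, we have the following result.

Divergence in cylindrical coordinates
$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{1}{r}\frac{\partial (r F_r)}{\partial r} + \frac{1}{r}\frac{\partial F_\phi}{\partial \phi} + \frac{\partial F_z}{\partial z}. \qquad (45)$$

Not surprisingly, this is similar to the expression for divergence in polar
coordinates, but with an additional term, $\partial F_z/\partial z$.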


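In spherical coordinates, the scale factors are $h_r = 1$, $h_\theta = r$ and
$h_\phi = r\sin\theta$, so $J = r^2\sin\theta$. Hence
$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{1}{r^2\sin\theta}\left[\frac{\partial (r^2\sin\theta\,F_r)}{\partial r} + \frac{\partial (r\sin\theta\,F_\theta)}{\partial \theta} + \frac{\partial (r F_\phi)}{\partial \phi}\right].$$

Remembering that functions of one variable are treated as constants when
partially differentiating with respect to another variable, we get the
following formula.

Divergence in spherical coordinates
$$\boldsymbol{\nabla} \cdot \mathbf{F} = \frac{1}{r^2}\frac{\partial (r^2 F_r)}{\partial r} + \frac{1}{r\sin\theta}\frac{\partial (\sin\theta\,F_\theta)}{\partial \theta} + \frac{1}{r\sin\theta}\frac{\partial F_\phi}{\partial \phi}. \qquad (46)$$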


You can take all these results on trust. Equations (44) to (46) are all listed in
the Handbook, so you need not memorise them for the exam. For more 
general purposes, it is worth trying to remember equation (43), which has 
a more symmetrical shape than the others, and can be used to construct 
them all. 
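The way equation (43) generates the other formulas can be illustrated with a short numerical sketch. This is an illustration only, under stated assumptions: the function names are hypothetical, the derivatives in equation (43) are estimated by central differences, and the test field $\mathbf{F} = r\,\mathbf{e}_r + \sin\theta\,\mathbf{e}_\theta$ is an arbitrary choice. For this field, equation (46) gives $\boldsymbol{\nabla} \cdot \mathbf{F} = 3 + (2\cos\theta)/r$, and the general formula should reproduce that value.

```python
import math


def divergence_orthogonal(F, h, coords, step=1e-6):
    """Numerical version of equation (43).

    F(u, v, w) returns the components (F_u, F_v, F_w);
    h(u, v, w) returns the scale factors (h_u, h_v, h_w)."""
    def jacobian(c):
        hu, hv, hw = h(*c)
        return hu * hv * hw

    def term(i, c):
        # The quantity J * F_i / h_i that is differentiated in equation (43)
        return jacobian(c) * F(*c)[i] / h(*c)[i]

    total = 0.0
    for i in range(3):
        plus = list(coords)
        minus = list(coords)
        plus[i] += step
        minus[i] -= step
        total += (term(i, plus) - term(i, minus)) / (2 * step)
    return total / jacobian(coords)


def spherical_scale_factors(r, theta, phi):
    # Scale factors of spherical coordinates from Table 1
    return (1.0, r, r * math.sin(theta))


def F(r, theta, phi):
    # Test field F = r e_r + sin(theta) e_theta (an arbitrary choice)
    return (r, math.sin(theta), 0.0)


r0, theta0, phi0 = 2.0, 0.7, 1.2   # arbitrary test point (an assumption)
numeric = divergence_orthogonal(F, spherical_scale_factors, (r0, theta0, phi0))
expected = 3 + 2 * math.cos(theta0) / r0   # from equation (46)
print(numeric, expected)
```

Supplying the polar or cylindrical scale factors instead (with a dummy third coordinate of scale factor 1 in the polar case) gives the corresponding check of equations (44) and (45).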




The focus here is on using equations (44) to (46) to find the divergences of
given vector fields. You will generally be told which coordinate system is 
used, so you just need to select the appropriate formula for divergence, 
carry out the partial differentiations, and simplify the result if possible. 


Exercise 22 
The following fields are in cylindrical coordinates. Find their divergences.
(a) F(r,¢,z) = 4r° e, (b) G(r, ¢,z) = r*sin ge, + 27 e,

It is essential to state which coordinate system is used because the coordinate r has different meanings in the cylindrical and spherical systems.

Exercise 23

The following fields are in spherical coordinates. Find their divergences.
(a) F(r,0,¢) = 4r°e, (b) G(r,6,¢) = rsin? 0 e9 + r cos 0 cos $ eg

Exercise 24

In spherical coordinates, a vector field F takes the form $\mathbf{F} = f(r)\,\mathbf{e}_r$, where
$f(r)$ depends only on r, the distance from the origin. If div F = 0 at all
points except the origin, show that $f(r)$ is proportional to $1/r^2$.

4 The curl of a vector field 
The second important quantity describing the spatial rate of change of a 
vector field F is its curl. In this section, we introduce curl in Cartesian 
coordinates and illustrate its physical meaning with some typical vector 
fields. We also show how curl is calculated in non-Cartesian orthogonal 
coordinate systems. 
4.1 Curl in Cartesian coordinates 
We briefly mentioned curl when considering ways in which the operator V 
can act on vector fields. Here we begin afresh with the basic definition. 
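Curl of a vector field in Cartesian coordinates

Suppose that a vector field $\mathbf{F}$ is expressed in Cartesian coordinates as

\[
\mathbf{F} = F_x \, \mathbf{i} + F_y \, \mathbf{j} + F_z \, \mathbf{k}.
\]

Then the curl of $\mathbf{F}$ is defined as

\[
\nabla \times \mathbf{F} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt]
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt]
F_x & F_y & F_z
\end{vmatrix}. \tag{47}
\]

As always when dealing with vector products, we assume that the
coordinate system is right-handed.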


The alternative notation curl F is commonly used. 
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By contrast, the divergence of a 
vector field is a scalar field. 


Figure 29 The right-hand 
grip rule 
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In this definition, the partial derivative operators in the second row act on 
the components in the third row. The determinant can then be expanded 
in the usual way. For example, the term involving i is 


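\[
\left( \frac{\partial}{\partial y} F_z - \frac{\partial}{\partial z} F_y \right) \mathbf{i}
= \left( \frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z} \right) \mathbf{i}.
\]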


The complete expansion gives the vector quantity 


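\[
\nabla \times \mathbf{F} =
\left( \frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z} \right) \mathbf{i}
+ \left( \frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x} \right) \mathbf{j}
+ \left( \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} \right) \mathbf{k}. \tag{48}
\]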


It is worth comparing the definitions of divergence and curl in
equations (41) and (48). The derivatives that occur in divergence may be
said to ‘go with the components’: for example, $\partial/\partial x$ acts on $F_x$, and so
on. Curl is quite different: it is built up of partial derivatives such as
$\partial F_x/\partial y$ and $\partial F_y/\partial x$ that describe how rapidly a component in one
direction changes when we move in a perpendicular direction. In three
dimensions, there are six such derivatives, and these are arranged in pairs
to give the components of $\nabla \times \mathbf{F}$ shown in equation (48).


We will calculate curls shortly. This is a straightforward task — we just 
need to calculate and combine the appropriate partial derivatives. Before 
doing this, let us see what curl means. Clearly, at any given point, V x F 
has a definite value. In fact, we can go further. 


It turns out that the curl of a vector field is another vector field. 


From the definition of a vector field (see Subsection 1.1), this implies that 
at any given point, the magnitude and direction of V x F are independent 
of the orientation of the coordinate system. This suggests that V x F 
might have some significant physical meaning. This is indeed the case, and 
the name curl provides a good clue. 


Intuitive meaning of curl 


The curl of a vector field F describes the extent to which F rotates or 
swirls locally about each point. 


In three-dimensional space, rotation involves an axis of rotation and a 
sense of rotation about that axis. Taking the coordinate system to be 
right-handed, these features are related to curl in a simple and direct way. 


• The axis of local rotation is along the direction of the curl vector.


• The sense of rotation around this axis is found by the right-hand
grip rule illustrated in Figure 29. With the thumb of your right hand
pointing in the direction of $\nabla \times \mathbf{F}$, the fingers of your closed right
hand indicate the sense of rotation associated with $\mathbf{F}$.


Establishing a precise link between the concept of ‘local rotation’ and the 
definition of curl is left for the next unit. For the moment, we just consider 
a few typical examples. The aim is to give you an intuitive feeling for curl 
so that you can interpret the results of your calculations. 


First, we consider two-dimensional vector fields. A vector field in the
$xy$-plane takes the form

\[
\mathbf{V} = V_x(x, y) \, \mathbf{i} + V_y(x, y) \, \mathbf{j}.
\]

In this case, $V_z = 0$, $\partial V_x/\partial z = 0$ and $\partial V_y/\partial z = 0$. Substituting into
equation (48), we see that the curl of $\mathbf{V}$ has only one component:

\[
\nabla \times \mathbf{V} = \left( \frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y} \right) \mathbf{k}. \tag{49}
\]

One way of interpreting such a curl is to suppose that $\mathbf{V}$ describes the
velocity of water on the surface of a river. Imagine a tiny circular disc
floating on the water. The disc will be carried downstream following the
direction of flow, but it may also rotate about a vertical axis as it drifts.
The curl of $\mathbf{V}$ is proportional to the rate of rotation of the disc. Of course,
the disc just serves as a marker making the curl of the underlying vector
field visible; we are not really interested in the disc itself.


To take a specific example, Figure 30 shows a straight stretch of river with 
its banks at y = —4 and y = 4 (in metres), with water flowing in the 
x-direction. Suppose that the velocity of water on the surface of this river 
(in metres per second) is given by the vector field 


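\[
\mathbf{V} = (16 - y^2) \, \mathbf{i} \quad (-4 < y < 4). \tag{50}
\]

Then the curl of this vector field is

\[
\operatorname{curl} \mathbf{V} = \left( \frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y} \right) \mathbf{k} = 2y \, \mathbf{k}, \tag{51}
\]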


which varies from point to point. Remember that our conventions require 
us to use right-handed coordinate systems, so with the x- and y-axes as 
shown in Figure 30, the z-axis points out of the page, towards you. 
Bearing this in mind, the interpretation of curl can be checked as follows. 


Suppose that a disc is placed at point A in Figure 30, equidistant from 
either bank. At this point y = 0, and equation (51) gives zero curl. This 
makes good sense because water flows symmetrically around the disc, 
producing no tendency to rotate one way or another: the disc drifts 
downstream without rotating. 


At point B, y = 2 and the calculated curl points along the positive z-axis. 
Using the right-hand grip rule, this is associated with a rotation about a 
vertical axis in an anticlockwise sense when viewed from above the river. 
This correctly describes how the disc revolves in response to a current that 
is stronger at the centre of the river than near its banks. At point C, 

y = —2 and this conclusion is reversed. The curl now points along the 
negative z-axis, corresponding to a clockwise rotation of the disc when 
seen from above, which is again what we should expect. 
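
The behaviour at points A, B and C can also be checked numerically. The following is a minimal sketch that estimates the single component of equation (49) by central differences for the river flow of equation (50). The helper name `curl_z` and the choice $x = 0$ for all three points are assumptions made for illustration; only the $y$-values come from the text.

```python
def curl_z(V_x, V_y, x, y, h=1e-5):
    """Estimate dV_y/dx - dV_x/dy (the k-component in equation (49))."""
    dVy_dx = (V_y(x + h, y) - V_y(x - h, y)) / (2 * h)
    dVx_dy = (V_x(x, y + h) - V_x(x, y - h)) / (2 * h)
    return dVy_dx - dVx_dy

# River flow of equation (50): V = (16 - y^2) i, with no j-component.
V_x = lambda x, y: 16 - y ** 2
V_y = lambda x, y: 0.0

# y-values of points A, B and C; x = 0 is an arbitrary (assumed) position.
points = {"A": 0.0, "B": 2.0, "C": -2.0}
curl_at = {name: curl_z(V_x, V_y, 0.0, y) for name, y in points.items()}
print(curl_at)
```

Up to rounding, the estimates are $0$ at A, $4$ at B and $-4$ at C, matching $2y$ from equation (51) and the signs used in the discussion above.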


Figure 30 The flow of water in a river

Even a straight-line flow may be associated with rotation and curl.

Unit 9 Differentiating scalar and vector fields

Figure 31 A swirling flow

Figure 32 An arrow on a very small disc points in a fixed direction as the disc drifts in the flow of Exercise 25
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A good example of a swirling flow is given by the vector field 
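\[
\mathbf{V} = -y \, \mathbf{i} + x \, \mathbf{j},
\]

whose arrow map is shown in Figure 31. We have

\[
\operatorname{curl} \mathbf{V} = \left( \frac{\partial x}{\partial x} - \frac{\partial (-y)}{\partial y} \right) \mathbf{k} = 2 \, \mathbf{k},
\]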


which is in the positive z-direction (i.e. out of the page towards you). 


The result can be interpreted using the right-hand grip rule. With the 
outstretched thumb of your right hand pointing in the z-direction, your 
fingers wrap in an anticlockwise sense. This is as expected: if Figure 31 
represents a flow of water, then a float placed in the centre of this flow 
would certainly revolve anticlockwise. In fact, curl V is a constant in this 
case, so the float would revolve anticlockwise, at the same rate, no matter 
where it was placed in the flow. This happens because water flows 
unsymmetrically around the float, producing an effect similar to that 
already described for the straight-flowing river of Figure 30. 


It may be tempting to suppose that any vector field with field lines that 
are closed loops has a non-zero curl. This is not true, as the following 
exercise shows. 


Exercise 25 


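The vector field

\[
\mathbf{V} = -\frac{y}{x^2 + y^2} \, \mathbf{i} + \frac{x}{x^2 + y^2} \, \mathbf{j} \quad ((x, y) \neq (0, 0))
\]

has the arrow map sketched in the margin. Show that the curl of $\mathbf{V}$ is
equal to the zero vector at all points in the domain of $\mathbf{V}$.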


The result of this exercise does not contradict our statement that curl 
describes the local rotation of a vector field. The important word here is 
local. If the field in Exercise 25 describes the two-dimensional flow of 
water, and a tiny disc is placed on the surface of the water, then the disc 
will travel around the origin in circles, following the circular field lines. 
However, for this particular flow, the disc does not rotate locally. If the 
disc is marked with an arrow that initially points East, the arrow continues 
to point East as the disc drifts in the current, as shown in Figure 32. This 
absence of local rotation agrees with the calculation of zero curl. 


Curl in three dimensions 


So far we have considered the curls of two-dimensional vector fields. More 
usually, we need to find the curl of a three-dimensional vector field. The 
interpretation is essentially the same. If a fluid has a velocity field V, then 
a sphere immersed in the fluid revolves about an axis aligned with the curl 
of V, and the sense of rotation is given by the right-hand grip rule. 


To calculate the curl of any three-dimensional vector field F, we start from 
the definition: 


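\[
\nabla \times \mathbf{F} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt]
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt]
F_x & F_y & F_z
\end{vmatrix}.
\]

It is a good idea to write down this equation at the start of any calculation
of curl in three dimensions. The alternative is to write down the expanded
version of the determinant,

\[
\nabla \times \mathbf{F} =
\left( \frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z} \right) \mathbf{i}
+ \left( \frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x} \right) \mathbf{j}
+ \left( \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} \right) \mathbf{k},
\]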


but this equation is harder to remember. If you want to use it as your
starting point, it is helpful to note that it contains a strong pattern, based
on the cyclic ordering shown in Figure 33.

Figure 33 A cyclic ordering of x, y and z underlies the formula for curl in Cartesian coordinates

• The $x$-component is obtained by acting with $\partial/\partial y$ on $F_z$ (note the
order $x \to y \to z$); a term with $y$ and $z$ interchanged is then
subtracted.

• The $y$-component is obtained by acting with $\partial/\partial z$ on $F_x$ (note the
order $y \to z \to x$); a term with $z$ and $x$ interchanged is then
subtracted.

• The $z$-component is obtained by acting with $\partial/\partial x$ on $F_y$ (note the
order $z \to x \to y$); a term with $x$ and $y$ interchanged is then
subtracted.
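
The cyclic pattern just described translates directly into index arithmetic. The sketch below is one possible illustration, not a standard library routine: it numbers the axes 0, 1, 2 for $x$, $y$, $z$, finds the two cyclic successors of each axis, and estimates the derivatives by central differences. The helper names `partial` and `curl` and the test field $\mathbf{F} = yz\,\mathbf{i} + xy\,\mathbf{k}$ are hypothetical.

```python
def partial(f, point, axis, h=1e-5):
    """Central-difference estimate of the partial derivative of f along axis."""
    up, down = list(point), list(point)
    up[axis] += h
    down[axis] -= h
    return (f(*up) - f(*down)) / (2 * h)

def curl(F, point):
    """F is a tuple (F_x, F_y, F_z) of functions of (x, y, z)."""
    components = []
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3  # cyclic successors of axis i
        # d/du_j acts on F_k; the term with j and k interchanged is subtracted.
        components.append(partial(F[k], point, j) - partial(F[j], point, k))
    return tuple(components)

# Hypothetical test field F = yz i + 0 j + xy k (an assumption for illustration).
F = (lambda x, y, z: y * z, lambda x, y, z: 0.0, lambda x, y, z: x * y)
result = curl(F, (1.0, 2.0, 3.0))
print(result)
```

For this field the expanded formula gives $\nabla \times \mathbf{F} = x\,\mathbf{i} - z\,\mathbf{k}$, so at $(1, 2, 3)$ the printed components should be close to $(1, 0, -3)$.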
Exercise 26 


Find the curls of the following vector fields. 
(a) F=(z¢-y)i+ (@-2z)j+(y-z)k (b) G = zy*i+ 2z7j+yr7?k 


Exercise 27 
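Given a scalar field $U(x, y, z)$, the corresponding gradient vector field is

\[
\nabla U = \frac{\partial U}{\partial x} \, \mathbf{i} + \frac{\partial U}{\partial y} \, \mathbf{j} + \frac{\partial U}{\partial z} \, \mathbf{k}.
\]

You may assume that $U$ varies smoothly enough to obey the mixed partial
derivative theorem of Unit 7.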


Show that the curl of this gradient vector field vanishes everywhere. 


Curls of vector fields in the real world 


Many fluid flows have a swirling motion, known as a vortex. Very 
often, the curl of the velocity is small over large regions, and is 
significant only in a small central region called the vortex core. 


An aircraft in flight continually generates a vortex from each of its 
wing tips, as illustrated in Figure 34. This is an inevitable 
consequence of the air flows that generate lift, but introduces 
unwelcome drag. To improve fuel economy, wings are designed to
minimise the energy that is wasted by generating such vortices.

Figure 34 Vortices shed from an aeroplane’s wing tips

Figure 35 Migrating geese maintain a V-formation

You may have seen birds such as geese flying in a V-formation
(Figure 35). This is also an adaptation to the generation of vortices 
at wing tips, because the lagging birds benefit from the 
upward-flowing air in vortices generated by the leading bird. Fair play 
is observed, as the birds regularly switch positions in the formation. 


Electric and magnetic fields can also have curl, and this is vital for 
many phenomena. A magnetic field that changes with time produces 
an electric field with a non-zero curl, and this lies behind the 
functioning of electricity generators. Moreover, an electric field that 
changes with time produces a magnetic field with a non-zero curl. 
These facts underpin the interpretation of light as a travelling 
disturbance of electric and magnetic fields. 


4.2 Curl in non-Cartesian coordinates 


You have seen that some vector fields are best described in non-Cartesian 
coordinate systems. This subsection explains how to calculate the curl of a 
vector field expressed in any orthogonal coordinate system. 


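In an orthogonal system of coordinates $(u, v, w)$ with unit vectors $\mathbf{e}_u$, $\mathbf{e}_v$
and $\mathbf{e}_w$, a vector field $\mathbf{F}$ can be expressed as

\[
\mathbf{F} = F_u \, \mathbf{e}_u + F_v \, \mathbf{e}_v + F_w \, \mathbf{e}_w,
\]

and we know that the del operator can be written as

\[
\nabla = \mathbf{e}_u \frac{1}{h_u} \frac{\partial}{\partial u} + \mathbf{e}_v \frac{1}{h_v} \frac{\partial}{\partial v} + \mathbf{e}_w \frac{1}{h_w} \frac{\partial}{\partial w}.
\]

It follows that the curl of $\mathbf{F}$ in the $(u, v, w)$ coordinate system is

\[
\nabla \times \mathbf{F} = \left( \mathbf{e}_u \frac{1}{h_u} \frac{\partial}{\partial u} + \mathbf{e}_v \frac{1}{h_v} \frac{\partial}{\partial v} + \mathbf{e}_w \frac{1}{h_w} \frac{\partial}{\partial w} \right) \times \left( F_u \, \mathbf{e}_u + F_v \, \mathbf{e}_v + F_w \, \mathbf{e}_w \right).
\]
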
This equation is correct, but is not immediately useful because there are 
still differentiations and vector products to carry out. We take a similar 
approach to that adopted earlier for divergence. The optional Appendix 
justifies the expressions that we use, but we just state the standard 
formulas here. Your task is to apply these formulas to specific vector fields. 


Curl of a vector field in orthogonal coordinates 


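In any orthogonal right-handed system of coordinates $(u, v, w)$ with
scale factors $h_u$, $h_v$ and $h_w$, a vector field

\[
\mathbf{F} = F_u \, \mathbf{e}_u + F_v \, \mathbf{e}_v + F_w \, \mathbf{e}_w
\]

has curl

\[
\nabla \times \mathbf{F} = \frac{1}{J}
\begin{vmatrix}
h_u \mathbf{e}_u & h_v \mathbf{e}_v & h_w \mathbf{e}_w \\[2pt]
\dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial w} \\[6pt]
h_u F_u & h_v F_v & h_w F_w
\end{vmatrix}, \tag{52}
\]

where $J = h_u h_v h_w$ is the Jacobian factor.

When the determinant is expanded, partial derivative operators in the
second row act on elements in the third row.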


The requirement for the coordinate system to be orthogonal and
right-handed implies, for example, that $\mathbf{e}_u \times \mathbf{e}_v = \mathbf{e}_w$ (rather than $-\mathbf{e}_w$).
All the coordinate systems $(u, v, w)$ used in this module are right-handed.
For example, in spherical coordinates $(r, \theta, \phi)$, we have $\mathbf{e}_r \times \mathbf{e}_\theta = \mathbf{e}_\phi$.


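It is easy to check that equation (52) works in Cartesian coordinates. In
this system, all the scale factors are equal to 1, so $J = 1$. Moreover, $\mathbf{e}_u = \mathbf{i}$,
$\mathbf{e}_v = \mathbf{j}$ and $\mathbf{e}_w = \mathbf{k}$, so we recover

\[
\nabla \times \mathbf{F} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt]
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt]
F_x & F_y & F_z
\end{vmatrix}.
\]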


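In cylindrical coordinates $(r, \phi, z)$, the scale factors are $h_r = 1$, $h_\phi = r$ and
$h_z = 1$, so $J = r$. We therefore obtain the following result.


Curl in cylindrical coordinates

\[
\nabla \times \mathbf{F} = \frac{1}{r}
\begin{vmatrix}
\mathbf{e}_r & r \mathbf{e}_\phi & \mathbf{e}_z \\[2pt]
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial \phi} & \dfrac{\partial}{\partial z} \\[6pt]
F_r & r F_\phi & F_z
\end{vmatrix}. \tag{53}
\]

Note that the scale factors accompany the unit vectors in row 1 and the
components in row 3. Also, take care to include the overall factor $1/J$,
i.e. $1/r$ in this case.
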
Example 8 


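The vector field $\mathbf{F} = r^2 \, \mathbf{e}_\phi$ is in cylindrical coordinates. Find its curl.

Solution

The field $\mathbf{F}$ has cylindrical components $F_r = 0$, $F_\phi = r^2$, $F_z = 0$, so

\[
\nabla \times \mathbf{F} = \frac{1}{r}
\begin{vmatrix}
\mathbf{e}_r & r \mathbf{e}_\phi & \mathbf{e}_z \\[2pt]
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial \phi} & \dfrac{\partial}{\partial z} \\[6pt]
0 & r^3 & 0
\end{vmatrix}
= \frac{1}{r} \left( \mathbf{e}_r (0) - r \mathbf{e}_\phi (0) + \mathbf{e}_z (3r^2) \right) = 3r \, \mathbf{e}_z.
\]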
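
The result of Example 8 can be cross-checked in Cartesian coordinates. Since $\mathbf{e}_\phi = (-y\,\mathbf{i} + x\,\mathbf{j})/r$ with $r = \sqrt{x^2 + y^2}$, the field is $\mathbf{F} = r(-y\,\mathbf{i} + x\,\mathbf{j})$, and equation (49) should then give $3r\,\mathbf{k}$. The following sketch tests this by central differences; the helper name `curl_z` and the sample point $(3, 4)$ are assumptions chosen for illustration.

```python
import math

def curl_z(V_x, V_y, x, y, h=1e-5):
    """Estimate dV_y/dx - dV_x/dy, the k-component of a planar curl."""
    dVy_dx = (V_y(x + h, y) - V_y(x - h, y)) / (2 * h)
    dVx_dy = (V_x(x, y + h) - V_x(x, y - h)) / (2 * h)
    return dVy_dx - dVx_dy

# F = r^2 e_phi rewritten in Cartesian components: F = r(-y i + x j).
F_x = lambda x, y: -y * math.hypot(x, y)
F_y = lambda x, y: x * math.hypot(x, y)

x, y = 3.0, 4.0           # arbitrary sample point, where r = 5
r = math.hypot(x, y)
numeric = curl_z(F_x, F_y, x, y)
expected = 3 * r          # the cylindrical result 3r e_z
print(numeric, expected)
```

Agreement between the two values supports the claim that the $z$-component found from equation (53) is the same vector component that the Cartesian definition produces.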
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Exercise 28 
The following vector fields are in cylindrical coordinates. Find their curls. 
(4) F=r" 6, (b) G=rzeg (c) H=rzsin¢e, 


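In two dimensions, polar coordinates are similar to cylindrical coordinates,
but there is no $z$-dependence. This means that we can get the expression
for curl in polar coordinates by expanding equation (53), bearing in mind
that $F_z = 0$, and $F_r$ and $F_\phi$ are independent of $z$. Setting $F_z = 0$ gives

\[
\nabla \times \mathbf{F} = \frac{1}{r}
\begin{vmatrix}
\mathbf{e}_r & r \mathbf{e}_\phi & \mathbf{e}_z \\[2pt]
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial \phi} & \dfrac{\partial}{\partial z} \\[6pt]
F_r & r F_\phi & 0
\end{vmatrix}.
\]

Then, using $\partial F_r/\partial z = 0$ and $\partial F_\phi/\partial z = 0$, we get the following result.


Curl in polar coordinates

\[
\nabla \times \mathbf{F} = \frac{1}{r} \left( \frac{\partial (r F_\phi)}{\partial r} - \frac{\partial F_r}{\partial \phi} \right) \mathbf{e}_z. \tag{54}
\]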


Exercise 29 


In polar coordinates, the vector field in Exercise 25 can be expressed as 


\[
\mathbf{F} = \frac{1}{r} \, \mathbf{e}_\phi \quad (r \neq 0).
\]


Use equation (54) to confirm that the curl of this field is equal to the zero 
vector. 


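The final coordinate system that we need to consider is spherical
coordinates $(r, \theta, \phi)$. In this case, the scale factors are $h_r = 1$, $h_\theta = r$ and
$h_\phi = r \sin\theta$, so $J = r^2 \sin\theta$, leading to the following result.


Curl in spherical coordinates

\[
\nabla \times \mathbf{F} = \frac{1}{r^2 \sin\theta}
\begin{vmatrix}
\mathbf{e}_r & r \mathbf{e}_\theta & r \sin\theta \, \mathbf{e}_\phi \\[2pt]
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial \theta} & \dfrac{\partial}{\partial \phi} \\[6pt]
F_r & r F_\theta & r \sin\theta \, F_\phi
\end{vmatrix}. \tag{55}
\]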


Exercise 30 


The following vector fields are in spherical coordinates. Find their curls. 
(a) F=reg (b) G=rsindeg (c) H=r’e, 


Learning outcomes 


After studying this unit, you should be able to do the following. 


• Define the terms scalar field and vector field.

• Interpret contour maps of scalar fields, and interpret arrow maps and
field line maps of vector fields.

• Convert a scalar or vector field expressed in Cartesian coordinates into
polar, cylindrical or spherical coordinates.

• Given a scalar field, calculate its gradient field in Cartesian, polar,
cylindrical or spherical coordinates.

• State and apply the properties of gradient fields.

• Given a vector field, calculate its divergence in Cartesian, polar,
cylindrical or spherical coordinates.

• Given a vector field, calculate its curl in Cartesian, polar, cylindrical
or spherical coordinates.

• Relate arrow maps of vector fields to their divergences and curls.


Appendix: proofs of results for div and curl


This optional Appendix is neither assessable nor examinable. Its aim 
is to justify the general formulas for divergence and curl in orthogonal 
coordinates given in equations (43) and (52). Because these proofs are 
difficult to find elsewhere, we include them for reference purposes and 
general interest. However, this material is more demanding than the 

general level of this module, so do not be dismayed if you find it hard. 


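Consider a general orthogonal coordinate system with coordinates
$(u, v, w)$, unit vectors $\mathbf{e}_u$, $\mathbf{e}_v$ and $\mathbf{e}_w$, and scale factors $h_u$, $h_v$ and $h_w$. In
such a system, a vector field $\mathbf{F}$ is written as

\[
\mathbf{F} = F_u \, \mathbf{e}_u + F_v \, \mathbf{e}_v + F_w \, \mathbf{e}_w,
\]

and the del operator is

\[
\nabla = \mathbf{e}_u \frac{1}{h_u} \frac{\partial}{\partial u} + \mathbf{e}_v \frac{1}{h_v} \frac{\partial}{\partial v} + \mathbf{e}_w \frac{1}{h_w} \frac{\partial}{\partial w}.
\]

Using these expressions, we can construct the divergence and curl in the
usual way:

\[
\operatorname{div} \mathbf{F} = \nabla \cdot \mathbf{F} \quad \text{and} \quad \operatorname{curl} \mathbf{F} = \nabla \times \mathbf{F}.
\]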




Do not study this Appendix at 
the expense of other units. It 
can be read for interest when 
you have the time (perhaps after 
completing the module). 
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In practice, however, these expressions need to be unpacked. The unit 
vectors may vary from point to point, and the effect of partial derivative 
operators such as 0/Ou on the components and the unit vectors of F must 
be worked out. The main text skipped directly to the final results, namely 
equations (43) and (52), but no proofs were given. The missing proofs are 
contained in this Appendix. 


Divergence and curl in polar coordinates 


Before looking at the general problem, it is helpful to consider a specific
case: the expressions for the divergence and curl of a two-dimensional
vector field in polar coordinates $(r, \phi)$. In this case, the scale factors are
$h_r = 1$ and $h_\phi = r$, and the divergence is

\[
\nabla \cdot \mathbf{F} = \left( \mathbf{e}_r \frac{\partial}{\partial r} + \frac{1}{r} \mathbf{e}_\phi \frac{\partial}{\partial \phi} \right) \cdot \left( F_r \, \mathbf{e}_r + F_\phi \, \mathbf{e}_\phi \right). \tag{56}
\]

The first task is to carry out the partial differentiations. One very
important fact must be understood: the unit vectors $\mathbf{e}_r$ and $\mathbf{e}_\phi$ are not
fixed, but vary with position. In fact, you saw in equations (11) that


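\[
\mathbf{e}_r = \cos\phi \, \mathbf{i} + \sin\phi \, \mathbf{j}, \qquad
\mathbf{e}_\phi = -\sin\phi \, \mathbf{i} + \cos\phi \, \mathbf{j}.
\]

Partially differentiating these equations with respect to $\phi$, we obtain

\[
\frac{\partial \mathbf{e}_r}{\partial \phi} = -\sin\phi \, \mathbf{i} + \cos\phi \, \mathbf{j} = \mathbf{e}_\phi, \qquad
\frac{\partial \mathbf{e}_\phi}{\partial \phi} = -\cos\phi \, \mathbf{i} - \sin\phi \, \mathbf{j} = -\mathbf{e}_r.
\]

Using these results, and noting that the unit vectors do not depend on $r$,
the derivatives that appear in equation (56) can be evaluated as follows:

\[
\frac{\partial}{\partial r} \left( F_r \, \mathbf{e}_r + F_\phi \, \mathbf{e}_\phi \right)
= \frac{\partial F_r}{\partial r} \, \mathbf{e}_r + \frac{\partial F_\phi}{\partial r} \, \mathbf{e}_\phi, \tag{57}
\]

\[
\begin{aligned}
\frac{\partial}{\partial \phi} \left( F_r \, \mathbf{e}_r + F_\phi \, \mathbf{e}_\phi \right)
&= \frac{\partial F_r}{\partial \phi} \, \mathbf{e}_r + F_r \frac{\partial \mathbf{e}_r}{\partial \phi} + \frac{\partial F_\phi}{\partial \phi} \, \mathbf{e}_\phi + F_\phi \frac{\partial \mathbf{e}_\phi}{\partial \phi} \\
&= \frac{\partial F_r}{\partial \phi} \, \mathbf{e}_r + F_r \, \mathbf{e}_\phi + \frac{\partial F_\phi}{\partial \phi} \, \mathbf{e}_\phi - F_\phi \, \mathbf{e}_r.
\end{aligned} \tag{58}
\]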


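To complete the evaluation of equation (56), we must first take the scalar
products of equations (57) and (58) with $\mathbf{e}_r$ and $\mathbf{e}_\phi / r$, respectively, and then
add the results. Since $\mathbf{e}_r$ and $\mathbf{e}_\phi$ are orthogonal unit vectors, we get

\[
\mathbf{e}_r \cdot \frac{\partial}{\partial r} \left( F_r \, \mathbf{e}_r + F_\phi \, \mathbf{e}_\phi \right) = \frac{\partial F_r}{\partial r},
\qquad
\frac{1}{r} \mathbf{e}_\phi \cdot \frac{\partial}{\partial \phi} \left( F_r \, \mathbf{e}_r + F_\phi \, \mathbf{e}_\phi \right) = \frac{1}{r} \left( F_r + \frac{\partial F_\phi}{\partial \phi} \right).
\]

Adding these results, and using the product rule of differentiation, we get

\[
\nabla \cdot \mathbf{F} = \frac{\partial F_r}{\partial r} + \frac{F_r}{r} + \frac{1}{r} \frac{\partial F_\phi}{\partial \phi}
= \frac{1}{r} \frac{\partial (r F_r)}{\partial r} + \frac{1}{r} \frac{\partial F_\phi}{\partial \phi},
\]

which confirms equation (44), a special case of equation (43).
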


A similar argument can be used for curl in polar coordinates. We start 
from the expression 


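\[
\nabla \times \mathbf{F} = \left( \mathbf{e}_r \frac{\partial}{\partial r} + \frac{1}{r} \mathbf{e}_\phi \frac{\partial}{\partial \phi} \right) \times \left( F_r \, \mathbf{e}_r + F_\phi \, \mathbf{e}_\phi \right),
\]

which is like equation (56), but with a vector product rather than a scalar
product.

Using equations (57) and (58), we get

\[
\nabla \times \mathbf{F} = \mathbf{e}_r \times \left( \frac{\partial F_r}{\partial r} \, \mathbf{e}_r + \frac{\partial F_\phi}{\partial r} \, \mathbf{e}_\phi \right)
+ \frac{1}{r} \mathbf{e}_\phi \times \left( \frac{\partial F_r}{\partial \phi} \, \mathbf{e}_r + F_r \, \mathbf{e}_\phi + \frac{\partial F_\phi}{\partial \phi} \, \mathbf{e}_\phi - F_\phi \, \mathbf{e}_r \right).
\]

We now evaluate the vector products. The vector product of any vector
with itself is equal to zero, so $\mathbf{e}_r \times \mathbf{e}_r = \mathbf{0}$ and $\mathbf{e}_\phi \times \mathbf{e}_\phi = \mathbf{0}$. Also, as shown
in Figure 36, the unit vectors $\mathbf{e}_r$, $\mathbf{e}_\phi$ and $\mathbf{e}_z$ (in that order) form a
right-handed system, so

\[
\mathbf{e}_r \times \mathbf{e}_\phi = \mathbf{e}_z \quad \text{and} \quad \mathbf{e}_\phi \times \mathbf{e}_r = -\mathbf{e}_z.
\]

Figure 36 The unit vectors $\mathbf{e}_r$, $\mathbf{e}_\phi$ and $\mathbf{e}_z$ form a right-handed system

Multiplying out the brackets and using these results in our expression for
the curl, we conclude that

\[
\nabla \times \mathbf{F} = \left( \frac{\partial F_\phi}{\partial r} - \frac{1}{r} \frac{\partial F_r}{\partial \phi} + \frac{F_\phi}{r} \right) \mathbf{e}_z
= \frac{1}{r} \left( \frac{\partial (r F_\phi)}{\partial r} - \frac{\partial F_r}{\partial \phi} \right) \mathbf{e}_z,
\]

which confirms equation (54), a special case of equation (52).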


General proof of the divergence formula 


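We now generalise to any orthogonal coordinate system. For this purpose,
it is helpful to use a slightly different notation in which we refer to
coordinates $u_i$, with unit vectors $\mathbf{e}_i$ and scale factors $h_i$, where the index $i$
can be 1, 2 or 3. The unit vectors are mutually orthogonal and
right-handed so that, for example, $\mathbf{e}_1 \cdot \mathbf{e}_2 = 0$ and $\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3$.


In this notation the del operator and a vector field $\mathbf{F}$ can be written
compactly as

\[
\nabla = \sum_i \mathbf{e}_i \frac{1}{h_i} \frac{\partial}{\partial u_i} \quad \text{and} \quad \mathbf{F} = \sum_i F_i \, \mathbf{e}_i,
\]

where it is understood that the sums range from $i = 1$ to $i = 3$.


The divergence of $\mathbf{F}$ in orthogonal coordinates can then be written as

\[
\nabla \cdot \mathbf{F} = \left( \sum_i \mathbf{e}_i \frac{1}{h_i} \frac{\partial}{\partial u_i} \right) \cdot \left( \sum_j F_j \, \mathbf{e}_j \right). \tag{59}
\]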


Notice that we have used the index $i$ in the first summation, and the
index $j$ in the second summation. This is an essential precaution. In any
single summation it makes no difference whether we call the index $i$ or $j$,
but when we combine two summations in the same formula, we would run
into problems if we used the same index throughout. We would be in
danger of leaving out terms such as

\[
\frac{1}{h_1} \, \mathbf{e}_1 \cdot \frac{\partial (F_2 \, \mathbf{e}_2)}{\partial u_1},
\]

where the two indices take different values.


Multiplying out the brackets in equation (59) gives

\[
\nabla \cdot \mathbf{F} = \sum_i \sum_j \frac{1}{h_i} \, \mathbf{e}_i \cdot \frac{\partial (F_j \, \mathbf{e}_j)}{\partial u_i},
\]

which is a sum of nine terms corresponding to $i = 1, 2, 3$ and $j = 1, 2, 3$.


The crucial step is the evaluation of the derivatives, especially the 
derivatives of the unit vectors. When we considered the case of polar 
coordinates, the unit vectors e, and eg were known, so their derivatives 
could be found explicitly. Now we need a route that works more generally, 
in any orthogonal coordinate system. 


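The key is to remember the relationship between unit vectors and tangent
vectors given in equations (25) and (26). In our present notation,

\[
\mathbf{e}_i = \frac{1}{h_i} \mathbf{T}_i, \quad \text{where} \quad \mathbf{T}_i = \left( \frac{\partial x}{\partial u_i}, \frac{\partial y}{\partial u_i}, \frac{\partial z}{\partial u_i} \right).
\]

The mixed partial derivative theorem of Unit 7 tells us that it does not
matter which order is used to carry out two partial differentiations in a
second-order derivative. Hence

\[
\frac{\partial \mathbf{T}_i}{\partial u_j}
= \frac{\partial}{\partial u_j} \left( \frac{\partial x}{\partial u_i}, \frac{\partial y}{\partial u_i}, \frac{\partial z}{\partial u_i} \right)
= \frac{\partial}{\partial u_i} \left( \frac{\partial x}{\partial u_j}, \frac{\partial y}{\partial u_j}, \frac{\partial z}{\partial u_j} \right)
= \frac{\partial \mathbf{T}_j}{\partial u_i}. \tag{60}
\]

This equation implicitly contains all the information that we need about
the spatial rates of change of the unit vectors. To take advantage of it, we
write equation (59) in terms of the tangent vectors, giving

\[
\nabla \cdot \mathbf{F} = \sum_i \sum_j \frac{1}{h_i^2} \, \mathbf{T}_i \cdot \frac{\partial}{\partial u_i} \left( \frac{F_j}{h_j} \mathbf{T}_j \right).
\]

Using the product rule of differentiation, we then have

\[
\nabla \cdot \mathbf{F} = \sum_i \sum_j \frac{1}{h_i^2} \left[ \frac{\partial}{\partial u_i} \left( \frac{F_j}{h_j} \right) \mathbf{T}_i \cdot \mathbf{T}_j + \frac{F_j}{h_j} \, \mathbf{T}_i \cdot \frac{\partial \mathbf{T}_j}{\partial u_i} \right],
\]

so applying equation (60), we get

\[
\nabla \cdot \mathbf{F} = \sum_i \sum_j \frac{1}{h_i^2} \left[ \frac{\partial}{\partial u_i} \left( \frac{F_j}{h_j} \right) \mathbf{T}_i \cdot \mathbf{T}_j + \frac{F_j}{h_j} \, \mathbf{T}_i \cdot \frac{\partial \mathbf{T}_i}{\partial u_j} \right]. \tag{61}
\]

This equation is ripe for simplification! In an orthogonal coordinate
system, the tangent vectors are orthogonal, so the scalar product in the
first term is equal to zero unless $i = j$, in which case it is equal to
$\mathbf{T}_i \cdot \mathbf{T}_i = |\mathbf{T}_i|^2 = h_i^2$. The scalar product in the last term can be written as

\[
\mathbf{T}_i \cdot \frac{\partial \mathbf{T}_i}{\partial u_j} = \frac{1}{2} \frac{\partial (\mathbf{T}_i \cdot \mathbf{T}_i)}{\partial u_j} = \frac{1}{2} \frac{\partial h_i^2}{\partial u_j} = h_i \frac{\partial h_i}{\partial u_j}.
\]

Using these results in equation (61), we obtain

\[
\nabla \cdot \mathbf{F} = \sum_j \left[ \frac{\partial}{\partial u_j} \left( \frac{F_j}{h_j} \right) + \frac{F_j}{h_j} \sum_i \frac{1}{h_i} \frac{\partial h_i}{\partial u_j} \right]. \tag{62}
\]

This can be tidied up by noting that

\[
\sum_i \frac{1}{h_i} \frac{\partial h_i}{\partial u_j} = \sum_i \frac{\partial (\ln h_i)}{\partial u_j} = \frac{\partial}{\partial u_j} \ln (h_1 h_2 h_3).
\]

Then, introducing the Jacobian factor $J = h_1 h_2 h_3$, we get

\[
\sum_i \frac{1}{h_i} \frac{\partial h_i}{\partial u_j} = \frac{\partial (\ln J)}{\partial u_j} = \frac{1}{J} \frac{\partial J}{\partial u_j}.
\]

Returning to equation (62) and using the product rule of differentiation,
we conclude that

\[
\nabla \cdot \mathbf{F} = \sum_j \left[ \frac{\partial}{\partial u_j} \left( \frac{F_j}{h_j} \right) + \frac{F_j}{h_j} \frac{1}{J} \frac{\partial J}{\partial u_j} \right]
= \frac{1}{J} \sum_j \frac{\partial}{\partial u_j} \left( \frac{J F_j}{h_j} \right),
\]

which is the required result (equation (43)).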


General proof of the curl formula 


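We again consider orthogonal coordinates $u_i$, with unit vectors $\mathbf{e}_i$ and scale
factors $h_i$ (for $i = 1, 2, 3$). The coordinate system is assumed to be
right-handed, so

\[
\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3, \quad \mathbf{e}_2 \times \mathbf{e}_3 = \mathbf{e}_1, \quad \mathbf{e}_3 \times \mathbf{e}_1 = \mathbf{e}_2. \tag{63}
\]

Note the cyclic pattern based on $1 \to 2 \to 3 \to 1 \to 2 \to \dots$.

In such a coordinate system, the curl of $\mathbf{F}$ is given by

\[
\nabla \times \mathbf{F} = \frac{1}{h_1} \, \mathbf{e}_1 \times \frac{\partial \mathbf{F}}{\partial u_1} + \frac{1}{h_2} \, \mathbf{e}_2 \times \frac{\partial \mathbf{F}}{\partial u_2} + \frac{1}{h_3} \, \mathbf{e}_3 \times \frac{\partial \mathbf{F}}{\partial u_3}. \tag{64}
\]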


In contrast with the divergence calculation, we do not need to expand F in 
terms of its components. 


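Let us focus on a single component of the curl. We consider $(\nabla \times \mathbf{F})_3$, the
component in the local direction of the unit vector $\mathbf{e}_3$. This is found by
taking the scalar product of equation (64) with $\mathbf{e}_3$, giving

\[
(\nabla \times \mathbf{F})_3 = \frac{1}{h_1} \, \mathbf{e}_3 \cdot \left( \mathbf{e}_1 \times \frac{\partial \mathbf{F}}{\partial u_1} \right) + \frac{1}{h_2} \, \mathbf{e}_3 \cdot \left( \mathbf{e}_2 \times \frac{\partial \mathbf{F}}{\partial u_2} \right) + \frac{1}{h_3} \, \mathbf{e}_3 \cdot \left( \mathbf{e}_3 \times \frac{\partial \mathbf{F}}{\partial u_3} \right).
\]

You may remember that equation (37) of Unit 4 gave the identity

\[
\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = (\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c},
\]

which is valid for any vectors $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$.

Using this identity, we get

\[
(\nabla \times \mathbf{F})_3 = \frac{1}{h_1} (\mathbf{e}_3 \times \mathbf{e}_1) \cdot \frac{\partial \mathbf{F}}{\partial u_1} + \frac{1}{h_2} (\mathbf{e}_3 \times \mathbf{e}_2) \cdot \frac{\partial \mathbf{F}}{\partial u_2} + \frac{1}{h_3} (\mathbf{e}_3 \times \mathbf{e}_3) \cdot \frac{\partial \mathbf{F}}{\partial u_3}.
\]

The last term is equal to zero because the vector product of any vector
with itself is equal to the zero vector. Equations (63) then give

\[
(\nabla \times \mathbf{F})_3 = \frac{1}{h_1} \, \mathbf{e}_2 \cdot \frac{\partial \mathbf{F}}{\partial u_1} - \frac{1}{h_2} \, \mathbf{e}_1 \cdot \frac{\partial \mathbf{F}}{\partial u_2}.
\]

Expressing this in terms of the tangent vectors $\mathbf{T}_i = h_i \mathbf{e}_i$, we have

\[
(\nabla \times \mathbf{F})_3 = \frac{1}{h_1 h_2} \left( \mathbf{T}_2 \cdot \frac{\partial \mathbf{F}}{\partial u_1} - \mathbf{T}_1 \cdot \frac{\partial \mathbf{F}}{\partial u_2} \right). \tag{65}
\]

Using the product rule, the term in brackets can be expressed as

\[
\mathbf{T}_2 \cdot \frac{\partial \mathbf{F}}{\partial u_1} - \mathbf{T}_1 \cdot \frac{\partial \mathbf{F}}{\partial u_2}
= \left( \frac{\partial (\mathbf{T}_2 \cdot \mathbf{F})}{\partial u_1} - \frac{\partial \mathbf{T}_2}{\partial u_1} \cdot \mathbf{F} \right) - \left( \frac{\partial (\mathbf{T}_1 \cdot \mathbf{F})}{\partial u_2} - \frac{\partial \mathbf{T}_1}{\partial u_2} \cdot \mathbf{F} \right).
\]

Now, the second terms in each bracket cancel out because equation (60)
ensures that $\partial \mathbf{T}_2/\partial u_1 = \partial \mathbf{T}_1/\partial u_2$. So returning to equation (65), we have

\[
(\nabla \times \mathbf{F})_3 = \frac{1}{h_1 h_2} \left( \frac{\partial (\mathbf{T}_2 \cdot \mathbf{F})}{\partial u_1} - \frac{\partial (\mathbf{T}_1 \cdot \mathbf{F})}{\partial u_2} \right).
\]

Finally, noting that

\[
\mathbf{T}_i \cdot \mathbf{F} = h_i \, \mathbf{e}_i \cdot \mathbf{F} = h_i F_i,
\]

we conclude that

\[
(\nabla \times \mathbf{F})_3 = \frac{1}{h_1 h_2} \left( \frac{\partial (h_2 F_2)}{\partial u_1} - \frac{\partial (h_1 F_1)}{\partial u_2} \right).
\]

This is equivalent to the third component of equation (52), as required.
Corresponding results for the other components can be found by
permuting the indices from $(1, 2, 3)$ to $(2, 3, 1)$ and $(3, 1, 2)$.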
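
For reference, carrying out the two index permutations explicitly on the result for the third component gives the following. This is a worked restatement under the same assumptions as the proof (orthogonal, right-handed coordinates), not an additional result.

```latex
% (1,2,3) -> (2,3,1): component along e_1
(\nabla \times \mathbf{F})_1
  = \frac{1}{h_2 h_3}
    \left( \frac{\partial (h_3 F_3)}{\partial u_2}
         - \frac{\partial (h_2 F_2)}{\partial u_3} \right),
\qquad
% (1,2,3) -> (3,1,2): component along e_2
(\nabla \times \mathbf{F})_2
  = \frac{1}{h_3 h_1}
    \left( \frac{\partial (h_1 F_1)}{\partial u_3}
         - \frac{\partial (h_3 F_3)}{\partial u_1} \right).
```

Multiplying each component by its unit vector and writing $1/(h_2 h_3) = h_1/J$, and so on, gives the same terms as expanding the determinant in equation (52) along its first row.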
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Solutions to exercises 


Solution to Exercise 1 
(a) Substituting $x = r\cos\phi$ and $y = r\sin\phi$, we get

\[
U(r, \phi) = r^2 \cos^2\phi - r^2 \sin^2\phi = r^2 \cos(2\phi).
\]

Recall the identity $\cos 2x = \cos^2 x - \sin^2 x$.

(b) We have

\[
V(r, \phi) = 2r^2 \cos\phi \sin\phi = r^2 \sin(2\phi).
\]

Recall the identity $\sin 2x = 2 \sin x \cos x$.

(c) We have

\[
W(r, \phi) = (r^2 \cos^2\phi + r^2 \sin^2\phi)^{1/2} = (r^2)^{1/2} = r,
\]

where we have taken the positive square root because
$r = \sqrt{x^2 + y^2} \geq 0$.


Solution to Exercise 2

In cylindrical coordinates, $r^2 = x^2 + y^2$, and the field is expressed as

\[
T(r, \phi, z) = 100 \, e^{-(r^2 + z^2)} \quad (r^2 + z^2 < 1).
\]

The point $(r, \phi, z) = (0.5, 0, 0.5)$ gives $r^2 + z^2 = 0.25 + 0.25 = 0.5 < 1$, so it
lies within the domain of the function. The value of the temperature at
this point is

\[
T(0.5, 0, 0.5) = 100 \, e^{-0.5} \approx 60.7,
\]

so the temperature is 60.7°C.


Solution to Exercise 3

(a) In cylindrical coordinates, we use the equations $x = r\cos\phi$, $y = r\sin\phi$
and $z = z$ to get

\[
U(r, \phi, z) = \frac{z}{(r^2 \cos^2\phi + r^2 \sin^2\phi + z^2)^{1/2}} = \frac{z}{(r^2 + z^2)^{1/2}}.
\]

Here, $r$ is the radial coordinate of cylindrical coordinates, which is the
distance from the $z$-axis (not the distance from the origin).

Alternatively, equation (6) can be used in the denominator of $U(x, y, z)$.

(b) In spherical coordinates, we use the equations $x = r\sin\theta\cos\phi$,
$y = r\sin\theta\sin\phi$ and $z = r\cos\theta$ to get

\[
\begin{aligned}
U(r, \theta, \phi) &= \frac{r\cos\theta}{(r^2 \sin^2\theta (\cos^2\phi + \sin^2\phi) + r^2 \cos^2\theta)^{1/2}} \\
&= \frac{r\cos\theta}{(r^2 \sin^2\theta + r^2 \cos^2\theta)^{1/2}} \\
&= \frac{r\cos\theta}{r} \\
&= \cos\theta.
\end{aligned}
\]

Alternatively, equation (8) can be used in the denominator of $U(x, y, z)$.
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Solution to Exercise 4 


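The magnitude of the vector $\mathbf{e}_r$ is

\[
|\mathbf{e}_r| = \sqrt{(\cos\phi)^2 + (\sin\phi)^2} = \sqrt{\cos^2\phi + \sin^2\phi} = 1,
\]

and the magnitude of the vector $\mathbf{e}_\phi$ is

\[
|\mathbf{e}_\phi| = \sqrt{(-\sin\phi)^2 + (\cos\phi)^2} = \sqrt{\sin^2\phi + \cos^2\phi} = 1.
\]

So $\mathbf{e}_r$ and $\mathbf{e}_\phi$ are unit vectors.

To show that $\mathbf{e}_r$ and $\mathbf{e}_\phi$ are orthogonal, we evaluate their scalar product:

\[
\mathbf{e}_r \cdot \mathbf{e}_\phi = (\cos\phi)(-\sin\phi) + (\sin\phi)(\cos\phi) = 0.
\]

Because the scalar product is zero, and neither $\mathbf{e}_r$ nor $\mathbf{e}_\phi$ is equal to the
zero vector, the two vectors must be orthogonal.
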
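Solution to Exercise 5

In cylindrical coordinates, the vector field takes the form

\[
\mathbf{F}(r, \phi, z) = F_r \, \mathbf{e}_r + F_\phi \, \mathbf{e}_\phi + F_z \, \mathbf{e}_z,
\]

where

\[
\mathbf{e}_r = \cos\phi \, \mathbf{i} + \sin\phi \, \mathbf{j}, \quad \mathbf{e}_\phi = -\sin\phi \, \mathbf{i} + \cos\phi \, \mathbf{j}, \quad \mathbf{e}_z = \mathbf{k}.
\]

Using Procedure 1, we get

\[
\begin{aligned}
F_r &= \mathbf{e}_r \cdot \mathbf{F} = (\cos\phi \, \mathbf{i} + \sin\phi \, \mathbf{j}) \cdot ((x + y) \, \mathbf{i} + (y - x) \, \mathbf{j} + 3z \, \mathbf{k}) \\
&= (x + y)\cos\phi + (y - x)\sin\phi, \\
F_\phi &= \mathbf{e}_\phi \cdot \mathbf{F} = (-\sin\phi \, \mathbf{i} + \cos\phi \, \mathbf{j}) \cdot ((x + y) \, \mathbf{i} + (y - x) \, \mathbf{j} + 3z \, \mathbf{k}) \\
&= -(x + y)\sin\phi + (y - x)\cos\phi, \\
F_z &= \mathbf{e}_z \cdot \mathbf{F} = \mathbf{k} \cdot ((x + y) \, \mathbf{i} + (y - x) \, \mathbf{j} + 3z \, \mathbf{k}) \\
&= 3z.
\end{aligned}
\]

The coordinate transformation equations for cylindrical coordinates are

\[
x = r\cos\phi, \quad y = r\sin\phi, \quad z = z,
\]

so we get

\[
\begin{aligned}
F_r &= r(\cos\phi + \sin\phi)\cos\phi + r(\sin\phi - \cos\phi)\sin\phi = r(\cos^2\phi + \sin^2\phi) = r, \\
F_\phi &= -r(\cos\phi + \sin\phi)\sin\phi + r(\sin\phi - \cos\phi)\cos\phi = -r(\sin^2\phi + \cos^2\phi) = -r, \\
F_z &= 3z.
\end{aligned}
\]

Hence in cylindrical coordinates,

\[
\mathbf{F}(r, \phi, z) = r \, \mathbf{e}_r - r \, \mathbf{e}_\phi + 3z \, \mathbf{e}_z.
\]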
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Solution to Exercise 6 

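In spherical coordinates, the vector field takes the form

\[
\mathbf{F}(r, \theta, \phi) = F_r \, \mathbf{e}_r + F_\theta \, \mathbf{e}_\theta + F_\phi \, \mathbf{e}_\phi,
\]

where

\[
\begin{aligned}
\mathbf{e}_r &= \sin\theta\cos\phi \, \mathbf{i} + \sin\theta\sin\phi \, \mathbf{j} + \cos\theta \, \mathbf{k}, \\
\mathbf{e}_\theta &= \cos\theta\cos\phi \, \mathbf{i} + \cos\theta\sin\phi \, \mathbf{j} - \sin\theta \, \mathbf{k}, \\
\mathbf{e}_\phi &= -\sin\phi \, \mathbf{i} + \cos\phi \, \mathbf{j}.
\end{aligned}
\]

Using Procedure 1, we get

\[
\begin{aligned}
F_r &= \mathbf{e}_r \cdot \mathbf{F} = (\sin\theta\cos\phi \, \mathbf{i} + \sin\theta\sin\phi \, \mathbf{j} + \cos\theta \, \mathbf{k}) \cdot z\mathbf{k} = z\cos\theta, \\
F_\theta &= \mathbf{e}_\theta \cdot \mathbf{F} = (\cos\theta\cos\phi \, \mathbf{i} + \cos\theta\sin\phi \, \mathbf{j} - \sin\theta \, \mathbf{k}) \cdot z\mathbf{k} = -z\sin\theta, \\
F_\phi &= \mathbf{e}_\phi \cdot \mathbf{F} = (-\sin\phi \, \mathbf{i} + \cos\phi \, \mathbf{j}) \cdot z\mathbf{k} = 0.
\end{aligned}
\]

The coordinate transformation equations for spherical coordinates are

\[
x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta,
\]

so

\[
F_r = r\cos^2\theta, \quad F_\theta = -r\sin\theta\cos\theta, \quad F_\phi = 0.
\]

Hence in spherical coordinates,

\[
\mathbf{F}(r, \theta, \phi) = r\cos^2\theta \, \mathbf{e}_r - r\sin\theta\cos\theta \, \mathbf{e}_\theta.
\]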


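Solution to Exercise 7

Using equations (17), the vector field is

\[
\begin{aligned}
\mathbf{F} &= \cos\theta \, (\sin\theta\cos\phi \, \mathbf{i} + \sin\theta\sin\phi \, \mathbf{j} + \cos\theta \, \mathbf{k}) \\
&\quad - \sin\theta \, (\cos\theta\cos\phi \, \mathbf{i} + \cos\theta\sin\phi \, \mathbf{j} - \sin\theta \, \mathbf{k}) \\
&= (\cos^2\theta + \sin^2\theta) \, \mathbf{k} = \mathbf{k}.
\end{aligned}
\]

Solution to Exercise 8

The three tangent vectors are

\[
\begin{aligned}
\mathbf{T}_r &= \frac{\partial x}{\partial r} \, \mathbf{i} + \frac{\partial y}{\partial r} \, \mathbf{j} + \frac{\partial z}{\partial r} \, \mathbf{k}, \\
\mathbf{T}_\theta &= \frac{\partial x}{\partial \theta} \, \mathbf{i} + \frac{\partial y}{\partial \theta} \, \mathbf{j} + \frac{\partial z}{\partial \theta} \, \mathbf{k}, \\
\mathbf{T}_\phi &= \frac{\partial x}{\partial \phi} \, \mathbf{i} + \frac{\partial y}{\partial \phi} \, \mathbf{j} + \frac{\partial z}{\partial \phi} \, \mathbf{k}.
\end{aligned}
\]

In spherical coordinates we have

\[
x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta,
\]

so

\[
\begin{aligned}
\mathbf{T}_r &= \sin\theta\cos\phi \, \mathbf{i} + \sin\theta\sin\phi \, \mathbf{j} + \cos\theta \, \mathbf{k}, \\
\mathbf{T}_\theta &= r\cos\theta\cos\phi \, \mathbf{i} + r\cos\theta\sin\phi \, \mathbf{j} - r\sin\theta \, \mathbf{k}, \\
\mathbf{T}_\phi &= -r\sin\theta\sin\phi \, \mathbf{i} + r\sin\theta\cos\phi \, \mathbf{j}.
\end{aligned}
\]

The magnitudes of these vectors are

\[
\begin{aligned}
h_r &= \sqrt{\sin^2\theta (\cos^2\phi + \sin^2\phi) + \cos^2\theta} = 1, \\
h_\theta &= \sqrt{r^2 \cos^2\theta (\cos^2\phi + \sin^2\phi) + r^2 \sin^2\theta} = r, \\
h_\phi &= \sqrt{r^2 \sin^2\theta (\sin^2\phi + \cos^2\phi)} = r\sin\theta,
\end{aligned}
\]

which are the familiar scale factors for spherical coordinates. Hence the
required unit vectors are

\[
\begin{aligned}
\mathbf{e}_r &= \sin\theta\cos\phi \, \mathbf{i} + \sin\theta\sin\phi \, \mathbf{j} + \cos\theta \, \mathbf{k}, \\
\mathbf{e}_\theta &= \cos\theta\cos\phi \, \mathbf{i} + \cos\theta\sin\phi \, \mathbf{j} - \sin\theta \, \mathbf{k}, \\
\mathbf{e}_\phi &= -\sin\phi \, \mathbf{i} + \cos\phi \, \mathbf{j},
\end{aligned}
\]

in agreement with equations (17).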


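Solution to Exercise 9

(a) With $V(x, y) = e^{x^2 + y^2}$, we have

\[
\operatorname{grad} V = \frac{\partial V}{\partial x} \, \mathbf{i} + \frac{\partial V}{\partial y} \, \mathbf{j}
= 2x \, e^{x^2 + y^2} \, \mathbf{i} + 2y \, e^{x^2 + y^2} \, \mathbf{j} = 2 e^{x^2 + y^2} (x \, \mathbf{i} + y \, \mathbf{j}).
\]

(b) With $V(x, y, z) = e^{x^2 + y^2 + z^2}$, we have

\[
\begin{aligned}
\operatorname{grad} V &= \frac{\partial V}{\partial x} \, \mathbf{i} + \frac{\partial V}{\partial y} \, \mathbf{j} + \frac{\partial V}{\partial z} \, \mathbf{k} \\
&= 2x \, e^{x^2 + y^2 + z^2} \, \mathbf{i} + 2y \, e^{x^2 + y^2 + z^2} \, \mathbf{j} + 2z \, e^{x^2 + y^2 + z^2} \, \mathbf{k} \\
&= 2 e^{x^2 + y^2 + z^2} (x \, \mathbf{i} + y \, \mathbf{j} + z \, \mathbf{k}).
\end{aligned}
\]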


Solution to Exercise 10 
The required figure is shown in the margin. 
Note the following points. 


• The arrows representing grad $V$ at A, B and C are perpendicular to
the contour lines passing through A, B and C, respectively.

• The arrows all point ‘uphill’, from lower values to higher values of $V$.

• The arrows are longer where the contour lines are closer together; this
is because the scalar field $V$ varies more rapidly where the contour
lines are closer together.


Solution to Exercise 11 
To simplify the partial differentiations, we rearrange the expression for 
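$V(x,y)$ to give
\[ V(x,y) = \ln\!\big((x^2 + y^2)^{1/2}\big) = \tfrac{1}{2}\ln(x^2 + y^2). \]
Then
\[ \frac{\partial V}{\partial x} = \frac{1}{2}\,\frac{1}{x^2 + y^2} \times 2x = \frac{x}{x^2 + y^2}, \]
and similarly,
\[ \frac{\partial V}{\partial y} = \frac{y}{x^2 + y^2}. \]
The gradient of the scalar field is therefore given by
\[ \nabla V = \frac{x\,\mathbf{i} + y\,\mathbf{j}}{x^2 + y^2}. \]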


(This is not defined at the origin — but this is not a problem because the 
origin is not in the domain of the field.) 


$V(x,y)$ remains constant along curves for which $x^2 + y^2$ is constant, so the
contour lines are circles centred on the origin. The diagram in the margin
shows the contour line through the point (1,1). At this point, $\nabla V$ points
in the direction of the radial vector $\mathbf{i} + \mathbf{j}$. An arrow representing $\nabla V$ is
shown on the diagram. This is perpendicular to the contour line, as 
expected. 


Solution to Exercise 12 


Partially differentiating V(x, y, z) with respect to x gives 
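\[
\begin{aligned}
\frac{\partial V}{\partial x} &= \frac{\partial}{\partial x}\,(x^2 + y^2 + z^2)^{-3/2} \\
&= -\tfrac{3}{2}\,(x^2 + y^2 + z^2)^{-5/2} \times 2x \\
&= -\frac{3x}{(x^2 + y^2 + z^2)^{5/2}}.
\end{aligned}
\]
Similarly,
\[ \frac{\partial V}{\partial y} = -\frac{3y}{(x^2 + y^2 + z^2)^{5/2}}, \quad \frac{\partial V}{\partial z} = -\frac{3z}{(x^2 + y^2 + z^2)^{5/2}}. \]
Hence the gradient is
\[ \nabla V = -\frac{3x\,\mathbf{i} + 3y\,\mathbf{j} + 3z\,\mathbf{k}}{(x^2 + y^2 + z^2)^{5/2}} \quad ((x,y,z) \neq (0,0,0)). \]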


Solution to Exercise 13 
(a) The gradient of the scalar field is 
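\[
\begin{aligned}
\nabla T &= -1000\exp\!\big(-(x^2 + 2y^2 + 2z^2)\big) \times (2x\,\mathbf{i} + 4y\,\mathbf{j} + 4z\,\mathbf{k}) \\
&= -2000\exp\!\big(-(x^2 + 2y^2 + 2z^2)\big)\,(x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k}).
\end{aligned}
\]
At the point (1,1,1), the gradient is
\[ \nabla T\big|_{(1,1,1)} = -2000e^{-5}\,(\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}) \approx -13.5\,(\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}), \]
to three significant figures, measured in degrees Celsius per metre.

(b) On moving away from (1,1,1), the temperature increases most rapidly
in the direction of the gradient at (1,1,1). This is the direction of the
vector $-(\mathbf{i} + 2\mathbf{j} + 2\mathbf{k})$, which has magnitude $\sqrt{1^2 + 2^2 + 2^2} = 3$, so the
corresponding unit vector is
\[ \hat{\mathbf{n}} = -\tfrac{1}{3}\,(\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}). \]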


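Solution to Exercise 14
(a) The gradient is given by
\[
\begin{aligned}
\nabla V &= \tfrac{3}{2}\,(x^2 + y^2 + z^2)^{1/2}\,(2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k}) \\
&= 3\,(x^2 + y^2 + z^2)^{1/2}\,(x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}).
\end{aligned}
\]
So at the point (1,2,2), the gradient has the value
\[ \nabla V\big|_{(1,2,2)} = 9\,(\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}). \]
The displacement vector from (1, 2, 2) to (0.98, 1.99, 2.01) is
\[ \delta\mathbf{s} = -0.02\,\mathbf{i} - 0.01\,\mathbf{j} + 0.01\,\mathbf{k}, \]
so the change in V is
\[
\begin{aligned}
\delta V &\approx \nabla V \cdot \delta\mathbf{s} \\
&= 9\,(\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}) \cdot (-0.02\,\mathbf{i} - 0.01\,\mathbf{j} + 0.01\,\mathbf{k}) \\
&= 9\,(-0.02 - 0.02 + 0.02) \\
&= -0.18.
\end{aligned}
\]

(b) The rate of change of V at the point (1,2,2) in the direction of the
unit vector $\hat{\mathbf{n}} = (3\mathbf{i} + 4\mathbf{k})/5$ is
\[ \hat{\mathbf{n}} \cdot \nabla V = 9\,(\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}) \cdot \tfrac{1}{5}(3\mathbf{i} + 4\mathbf{k}) = \frac{99}{5} = 19.8. \]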
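
As a rough numerical check of this solution, the sketch below compares the linear estimate of the change in V with the change obtained by evaluating V directly at the two points. It assumes the scalar field V = (x^2 + y^2 + z^2)^(3/2), which is the field whose gradient appears in part (a); the helper names are illustrative.

```python
import math

def V(x, y, z):
    """Scalar field assumed for this check: V = (x^2 + y^2 + z^2)^(3/2)."""
    return (x * x + y * y + z * z) ** 1.5

def grad_V(x, y, z):
    """Gradient from part (a): 3 (x^2 + y^2 + z^2)^(1/2) (x, y, z)."""
    s = 3.0 * math.sqrt(x * x + y * y + z * z)
    return (s * x, s * y, s * z)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

p = (1.0, 2.0, 2.0)
ds = (-0.02, -0.01, 0.01)                     # displacement vector

estimate = dot(grad_V(*p), ds)                # linear estimate of the change in V
q = tuple(a + b for a, b in zip(p, ds))
actual = V(*q) - V(*p)                        # change from direct evaluation

n_hat = (0.6, 0.0, 0.8)                       # unit vector (3i + 4k)/5
rate = dot(grad_V(*p), n_hat)                 # directional rate of change

print(estimate, actual, rate)
```

The estimate and the directly computed change should differ only slightly, because the displacement is small compared with the distance over which the gradient varies.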


Solution to Exercise 15 


In spherical coordinates, the gradient is 


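\[ \nabla V = \frac{\partial V}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial V}{\partial\theta}\,\mathbf{e}_\theta + \frac{1}{r\sin\theta}\frac{\partial V}{\partial\phi}\,\mathbf{e}_\phi. \]
The required partial derivatives are
\[ \frac{\partial V}{\partial r} = -\frac{3}{r^4}, \quad \frac{\partial V}{\partial\theta} = 0, \quad \frac{\partial V}{\partial\phi} = 0. \]
So
\[ \nabla V = -\frac{3}{r^4}\,\mathbf{e}_r. \]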


This gradient field has magnitude 3/r* and points radially inwards, 
towards the origin. 
Solution to Exercise 16 


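In cylindrical coordinates, the gradient is
\[ \nabla f = \frac{\partial f}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial f}{\partial\phi}\,\mathbf{e}_\phi + \frac{\partial f}{\partial z}\,\mathbf{e}_z. \]
The required partial derivatives are
\[ \frac{\partial f}{\partial r} = 2r\sin(2\phi), \quad \frac{\partial f}{\partial\phi} = 2r^2\cos(2\phi), \quad \frac{\partial f}{\partial z} = 2z. \]
So the gradient is
\[ \nabla f = 2r\sin(2\phi)\,\mathbf{e}_r + 2r\cos(2\phi)\,\mathbf{e}_\phi + 2z\,\mathbf{e}_z. \]

The unit vectors $\mathbf{e}_r$, $\mathbf{e}_\phi$ and $\mathbf{e}_z$ are mutually orthogonal, so the square of
the magnitude of the gradient is given by the sum of the squares of its
components:
\[ |\nabla f|^2 = 4r^2\sin^2(2\phi) + 4r^2\cos^2(2\phi) + 4z^2 = 4(r^2 + z^2). \]
Hence the magnitude of the gradient is
\[ |\nabla f| = 2\sqrt{r^2 + z^2}. \]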


Solution to Exercise 17 


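(a) In spherical coordinates, the gradient vector field is
\[ \nabla T = \frac{\partial T}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial T}{\partial\theta}\,\mathbf{e}_\theta + \frac{1}{r\sin\theta}\frac{\partial T}{\partial\phi}\,\mathbf{e}_\phi. \]
The required partial derivatives are
\[ \frac{\partial T}{\partial r} = \sin\theta, \quad \frac{\partial T}{\partial\theta} = r\cos\theta, \quad \frac{\partial T}{\partial\phi} = 0. \]
So
\[ \nabla T = \sin\theta\,\mathbf{e}_r + \cos\theta\,\mathbf{e}_\theta. \]

The rate of change of T in the direction of the unit vector
$\hat{\mathbf{n}} = (\mathbf{e}_\theta + \mathbf{e}_\phi)/\sqrt{2}$ is
\[ \hat{\mathbf{n}} \cdot \nabla T = \frac{1}{\sqrt{2}}\,(\mathbf{e}_\theta + \mathbf{e}_\phi) \cdot (\sin\theta\,\mathbf{e}_r + \cos\theta\,\mathbf{e}_\theta) = \frac{1}{\sqrt{2}}\cos\theta, \]
where we have used the fact that the vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$ and $\mathbf{e}_\phi$ are
mutually orthogonal and of unit magnitude.

Note that the required rate of change is the component of the gradient
in the direction of the unit vector $\hat{\mathbf{n}}$, which is most readily evaluated
in spherical coordinates in this case.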


Solution to Exercise 18 


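In polar coordinates,
\[ \nabla = \mathbf{e}_r\,\frac{\partial}{\partial r} + \mathbf{e}_\phi\,\frac{1}{r}\frac{\partial}{\partial\phi}. \]
In cylindrical coordinates,
\[ \nabla = \mathbf{e}_r\,\frac{\partial}{\partial r} + \mathbf{e}_\phi\,\frac{1}{r}\frac{\partial}{\partial\phi} + \mathbf{e}_z\,\frac{\partial}{\partial z}. \]
In spherical coordinates,
\[ \nabla = \mathbf{e}_r\,\frac{\partial}{\partial r} + \mathbf{e}_\theta\,\frac{1}{r}\frac{\partial}{\partial\theta} + \mathbf{e}_\phi\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}. \]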
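Solution to Exercise 19
(a)
\[ \nabla \cdot \mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} = 2xy + 2yz - 2yz = 2xy. \]
(b)
\[ \nabla \cdot \mathbf{G} = \frac{\partial G_x}{\partial x} + \frac{\partial G_y}{\partial y} + \frac{\partial G_z}{\partial z} = 2(x + y) + 2(y + z) + 2(x + z) = 4(x + y + z). \]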


Solution to Exercise 20 


(a) It makes no sense to take the divergence of a scalar field. The formula 
for divergence involves the components of a vector field, but a scalar 
field has no direction so there are no components to use. 

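(b) Given that $V = x^4 + y^4 + z^4$, the gradient of V is
\[ \nabla V = \frac{\partial V}{\partial x}\,\mathbf{i} + \frac{\partial V}{\partial y}\,\mathbf{j} + \frac{\partial V}{\partial z}\,\mathbf{k} = 4x^3\,\mathbf{i} + 4y^3\,\mathbf{j} + 4z^3\,\mathbf{k}, \]
so the divergence of the gradient is
\[ \nabla \cdot (\nabla V) = \frac{\partial(4x^3)}{\partial x} + \frac{\partial(4y^3)}{\partial y} + \frac{\partial(4z^3)}{\partial z} = 12(x^2 + y^2 + z^2). \]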


Solution to Exercise 21 


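(a) To partially differentiate $F_x = x/\sqrt{x^2 + y^2}$ with respect to x, we use
the quotient rule:
\[
\begin{aligned}
\frac{\partial F_x}{\partial x} &= \frac{(x^2 + y^2)^{1/2} - x\left(\tfrac{1}{2}(x^2 + y^2)^{-1/2} \times 2x\right)}{x^2 + y^2} \\
&= \frac{x^2 + y^2 - x^2}{(x^2 + y^2)^{3/2}} = \frac{y^2}{(x^2 + y^2)^{3/2}}.
\end{aligned}
\]
Because of the symmetry between $F_x$ and $F_y$, a similar result is
obtained for $\partial F_y/\partial y$, but with x and y interchanged. There is no
z-component in this two-dimensional case, so $\partial F_z/\partial z = 0$. Hence
\[
\begin{aligned}
\nabla \cdot \mathbf{F} &= \frac{y^2}{(x^2 + y^2)^{3/2}} + \frac{x^2}{(x^2 + y^2)^{3/2}} + 0 \\
&= \frac{x^2 + y^2}{(x^2 + y^2)^{3/2}} = \frac{1}{(x^2 + y^2)^{1/2}} \quad ((x,y) \neq (0,0)).
\end{aligned}
\]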
(b) The divergence V - F calculated in part (a) is positive at all points 
(except the origin (0,0), where neither F nor its divergence is defined). 
We therefore expect F to diverge away from a typical point, such as 
that marked by the blue dot in the figure provided with the question. 


This interpretation is supported by examining the figure. Let us
suppose that F represents the flow of a fluid. Compare the arrows
that enter the square on sides AB and AD with those that leave the
square on sides BC and CD. All these arrows have the same length,
but those on sides AB and AD are more closely parallel to the sides of
the square than those on sides BC and CD. This means that less
fluid is carried into the square across sides AB and AD than is carried
out across sides BC and CD. So there is a net flow of fluid out of the
square, as expected for a field with positive divergence. A more
systematic discussion of the flow of fluids across surfaces will be given
in the next unit.
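
The result of part (a) can be checked numerically. The sketch below is illustrative only: it assumes the field components F_x = x/sqrt(x^2 + y^2) and F_y = y/sqrt(x^2 + y^2) (the second inferred from the symmetry argument above), estimates the divergence by central differences, and compares it with 1/sqrt(x^2 + y^2). The function names, step size and sample point are hypothetical choices.

```python
import math

def F(x, y):
    """Assumed field: the unit radial vector (x, y)/sqrt(x^2 + y^2)."""
    rho = math.hypot(x, y)
    return (x / rho, y / rho)

def divergence(field, x, y, h=1e-5):
    """Central-difference estimate of dFx/dx + dFy/dy at the point (x, y)."""
    dFx_dx = (field(x + h, y)[0] - field(x - h, y)[0]) / (2 * h)
    dFy_dy = (field(x, y + h)[1] - field(x, y - h)[1]) / (2 * h)
    return dFx_dx + dFy_dy

x0, y0 = 1.5, -0.8                      # arbitrary point away from the origin
numerical = divergence(F, x0, y0)
formula = 1.0 / math.hypot(x0, y0)      # 1 / (x^2 + y^2)^(1/2)
print(numerical, formula)
```

Both numbers are positive, as the discussion in part (b) requires, and they should agree closely at any point away from the origin.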


Solution to Exercise 22 
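(a) Using equation (45) for divergence in cylindrical coordinates, and
noting that $F_r = 4r^3$, $F_\phi = 0$ and $F_z = 0$, we get
\[ \nabla \cdot \mathbf{F} = \frac{1}{r}\frac{\partial(4r^4)}{\partial r} = 16r^2. \]

(b) Similarly, with $G_r = r^2\sin\phi$, $G_\phi = 0$ and $G_z = z^2$, we get
\[ \nabla \cdot \mathbf{G} = \frac{1}{r}\frac{\partial(r^3\sin\phi)}{\partial r} + \frac{\partial(z^2)}{\partial z} = 3r\sin\phi + 2z. \]


Solution to Exercise 23


(a) Using equation (46) for divergence in spherical coordinates, with
$F_r = 4r^3$, $F_\theta = 0$ and $F_\phi = 0$, we get
\[ \nabla \cdot \mathbf{F} = \frac{1}{r^2}\frac{\partial(4r^5)}{\partial r} = 20r^2. \]
Note that the vector field F is different to that in Exercise 22(a)
because the unit vector $\mathbf{e}_r$ in spherical coordinates is not the same as
the unit vector $\mathbf{e}_r$ in cylindrical coordinates. It is therefore not
surprising that the divergences of these fields are different.

(b) Similarly, with $G_r = 0$, $G_\theta = r\sin^2\theta$ and $G_\phi = r\cos\theta\cos\phi$, we get
\[
\begin{aligned}
\nabla \cdot \mathbf{G} &= \frac{1}{r\sin\theta}\frac{\partial(r\sin^3\theta)}{\partial\theta} + \frac{1}{r\sin\theta}\frac{\partial(r\cos\theta\cos\phi)}{\partial\phi} \\
&= \frac{3r\sin^2\theta\cos\theta}{r\sin\theta} - \frac{r\cos\theta\sin\phi}{r\sin\theta} \\
&= 3\sin\theta\cos\theta - \cot\theta\sin\phi.
\end{aligned}
\]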


Solution to Exercise 24 


Using the expression for divergence in spherical coordinates, and the given
fact that div F = 0 at all points except the origin, we get
\[ \frac{1}{r^2}\frac{\partial(r^2 f(r))}{\partial r} = 0 \quad \text{for } r > 0. \]
So
\[ \frac{\partial(r^2 f(r))}{\partial r} = 0. \]
Integrating both sides of this equation, we conclude that $r^2 f(r) = C$,
where C is a constant, so $f(r) = C/r^2$, which is proportional to $1/r^2$.


Solution to Exercise 25
We have 


ore (2(45)-&(%) 
- (E+ Lente Vy x20) 
(x? + y?) (x2 + y?) 
= (PE) = 0 (eu) 4 (0.0). 


Solution to Exercise 26 


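(a)
\[
\nabla \times \mathbf{F} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
z - y & x - z & y - x
\end{vmatrix}.
\]
Expanding the determinant, we obtain
\[
\begin{aligned}
\nabla \times \mathbf{F} &= \mathbf{i}\left(\frac{\partial(y - x)}{\partial y} - \frac{\partial(x - z)}{\partial z}\right) - \mathbf{j}\left(\frac{\partial(y - x)}{\partial x} - \frac{\partial(z - y)}{\partial z}\right) \\
&\quad + \mathbf{k}\left(\frac{\partial(x - z)}{\partial x} - \frac{\partial(z - y)}{\partial y}\right) \\
&= 2\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}.
\end{aligned}
\]
(The minus sign in front of j comes from the rule for expanding a
3 × 3 determinant.)

(b)
\[
\begin{aligned}
\nabla \times \mathbf{G} &=
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
zy^2 & xz^2 & yx^2
\end{vmatrix} \\
&= \mathbf{i}\left(\frac{\partial(yx^2)}{\partial y} - \frac{\partial(xz^2)}{\partial z}\right) - \mathbf{j}\left(\frac{\partial(yx^2)}{\partial x} - \frac{\partial(zy^2)}{\partial z}\right) + \mathbf{k}\left(\frac{\partial(xz^2)}{\partial x} - \frac{\partial(zy^2)}{\partial y}\right) \\
&= \mathbf{i}\,(x^2 - 2xz) - \mathbf{j}\,(2xy - y^2) + \mathbf{k}\,(z^2 - 2yz).
\end{aligned}
\]
So
\[ \nabla \times \mathbf{G} = x(x - 2z)\,\mathbf{i} + y(y - 2x)\,\mathbf{j} + z(z - 2y)\,\mathbf{k}. \]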
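
Both curls can be checked numerically. The sketch below is a hedged illustration: it assumes the field components read from the determinants above, namely F = (z - y, x - z, y - x) and G = (zy^2, xz^2, yx^2), and estimates each partial derivative by a central difference. The helper names and the sample point are arbitrary.

```python
def curl(field, p, h=1e-5):
    """Central-difference estimate of the curl of a 3D vector field at point p."""
    def d(i, j):
        # Partial derivative of component i with respect to coordinate j.
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return (field(*plus)[i] - field(*minus)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2),    # x-component: dFz/dy - dFy/dz
            d(0, 2) - d(2, 0),    # y-component: dFx/dz - dFz/dx
            d(1, 0) - d(0, 1))    # z-component: dFy/dx - dFx/dy

def F(x, y, z):
    return (z - y, x - z, y - x)

def G(x, y, z):
    return (z * y**2, x * z**2, y * x**2)

p = (0.3, -1.2, 2.0)
x, y, z = p
print(curl(F, p))                                          # compare with (2, 2, 2)
print(curl(G, p))
print((x * (x - 2*z), y * (y - 2*x), z * (z - 2*y)))       # closed form from part (b)
```

Note that the y-component is written with the sign already absorbed, which is equivalent to the minus sign in front of j in the determinant expansion.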
Solution to Exercise 27 


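The required curl is
\[
\nabla \times \nabla U =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
\dfrac{\partial U}{\partial x} & \dfrac{\partial U}{\partial y} & \dfrac{\partial U}{\partial z}
\end{vmatrix}.
\]
This evaluates to
\[
\nabla \times \nabla U = \mathbf{i}\left(\frac{\partial^2 U}{\partial y\,\partial z} - \frac{\partial^2 U}{\partial z\,\partial y}\right) - \mathbf{j}\left(\frac{\partial^2 U}{\partial x\,\partial z} - \frac{\partial^2 U}{\partial z\,\partial x}\right) + \mathbf{k}\left(\frac{\partial^2 U}{\partial x\,\partial y} - \frac{\partial^2 U}{\partial y\,\partial x}\right).
\]
Within each of the brackets on the right-hand side, the two mixed partial
derivatives are equal:
\[
\frac{\partial^2 U}{\partial y\,\partial z} = \frac{\partial^2 U}{\partial z\,\partial y}, \quad
\frac{\partial^2 U}{\partial x\,\partial z} = \frac{\partial^2 U}{\partial z\,\partial x}, \quad
\frac{\partial^2 U}{\partial x\,\partial y} = \frac{\partial^2 U}{\partial y\,\partial x}.
\]
We conclude that $\nabla \times \nabla U = \mathbf{0}$ for any scalar field U.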


Solution to Exercise 28 
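(a) The field F has cylindrical components $F_r = 0$, $F_\phi = 0$, $F_z = r^2$, so
\[
\begin{aligned}
\nabla \times \mathbf{F} &= \frac{1}{r}
\begin{vmatrix}
\mathbf{e}_r & r\,\mathbf{e}_\phi & \mathbf{e}_z \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\
0 & 0 & r^2
\end{vmatrix} \\
&= \frac{1}{r}\,\big(\mathbf{e}_r\,(0) - r\,\mathbf{e}_\phi\,(2r) + \mathbf{e}_z\,(0)\big) \\
&= -2r\,\mathbf{e}_\phi.
\end{aligned}
\]

(b) The field G has cylindrical components $G_r = 0$, $G_\phi = rz$, $G_z = 0$, so
\[
\begin{aligned}
\nabla \times \mathbf{G} &= \frac{1}{r}
\begin{vmatrix}
\mathbf{e}_r & r\,\mathbf{e}_\phi & \mathbf{e}_z \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\
0 & r^2 z & 0
\end{vmatrix} \\
&= \frac{1}{r}\,\big(\mathbf{e}_r\,(-r^2) - r\,\mathbf{e}_\phi\,(0) + \mathbf{e}_z\,(2rz)\big) \\
&= -r\,\mathbf{e}_r + 2z\,\mathbf{e}_z.
\end{aligned}
\]

(c) The field H has cylindrical components $H_r = rz\sin\phi$, $H_\phi = 0$,
$H_z = 0$, so
\[
\begin{aligned}
\nabla \times \mathbf{H} &= \frac{1}{r}
\begin{vmatrix}
\mathbf{e}_r & r\,\mathbf{e}_\phi & \mathbf{e}_z \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\
rz\sin\phi & 0 & 0
\end{vmatrix} \\
&= \frac{1}{r}\,\big(\mathbf{e}_r\,(0) - r\,\mathbf{e}_\phi\,(-r\sin\phi) + \mathbf{e}_z\,(-rz\cos\phi)\big) \\
&= r\sin\phi\,\mathbf{e}_\phi - z\cos\phi\,\mathbf{e}_z.
\end{aligned}
\]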
Solution to Exercise 29


The polar components of F are $F_r = 0$ and $F_\phi = 1/r$, so equation (54)
gives
\[ \nabla \times \mathbf{F} = \frac{1}{r}\left(\frac{\partial(1)}{\partial r} - \frac{\partial(0)}{\partial\phi}\right)\mathbf{e}_z = \mathbf{0} \quad (r \neq 0). \]


Solution to Exercise 30 


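(a) The field F has spherical components $F_r = 0$, $F_\theta = r$ and $F_\phi = 0$, so
equation (55) gives
\[
\begin{aligned}
\nabla \times \mathbf{F} &= \frac{1}{r^2\sin\theta}
\begin{vmatrix}
\mathbf{e}_r & r\,\mathbf{e}_\theta & r\sin\theta\,\mathbf{e}_\phi \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\phi} \\
0 & r^2 & 0
\end{vmatrix} \\
&= \frac{1}{r^2\sin\theta}\,\big(\mathbf{e}_r\,(0) - r\,\mathbf{e}_\theta\,(0) + r\sin\theta\,\mathbf{e}_\phi\,(2r)\big) \\
&= 2\,\mathbf{e}_\phi.
\end{aligned}
\]
(b) G has spherical components $G_r = 0$, $G_\theta = 0$ and $G_\phi = r\sin\theta$, so
\[
\begin{aligned}
\nabla \times \mathbf{G} &= \frac{1}{r^2\sin\theta}
\begin{vmatrix}
\mathbf{e}_r & r\,\mathbf{e}_\theta & r\sin\theta\,\mathbf{e}_\phi \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\phi} \\
0 & 0 & r^2\sin^2\theta
\end{vmatrix} \\
&= \frac{1}{r^2\sin\theta}\,\big(\mathbf{e}_r\,(2r^2\sin\theta\cos\theta) - r\,\mathbf{e}_\theta\,(2r\sin^2\theta) + r\sin\theta\,\mathbf{e}_\phi\,(0)\big) \\
&= 2\cos\theta\,\mathbf{e}_r - 2\sin\theta\,\mathbf{e}_\theta.
\end{aligned}
\]
(c) H has spherical components $H_r = r^2$, $H_\theta = 0$ and $H_\phi = 0$, so
\[
\nabla \times \mathbf{H} = \frac{1}{r^2\sin\theta}
\begin{vmatrix}
\mathbf{e}_r & r\,\mathbf{e}_\theta & r\sin\theta\,\mathbf{e}_\phi \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\phi} \\
r^2 & 0 & 0
\end{vmatrix}
= \mathbf{0}.
\]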


Acknowledgements 


Grateful acknowledgement is made to the following sources: 


Figure 2: Hans G / www.flickr.com/photos/48351129@N08/10223373043.
This file is licensed under the Creative Commons Attribution-Share Alike
Licence http://creativecommons.org/licenses/by-sa/3.0.


Figure 35: NASA. 


Every effort has been made to contact copyright holders. If any have been 
inadvertently overlooked, the publishers will be pleased to make the 
necessary arrangements at the first opportunity. 




Unit 10 


Integrating scalar and vector fields 


Introduction 


Unit 8 explained how to carry out area integrals over flat surfaces, surface 
integrals over curved surfaces, and volume integrals in three-dimensional 
space. However, there is another type of integral that is important to us — 
an integral along a curve, known as a line integral. 


You are familiar with the idea of integrating along a straight line. Figure 1 
shows a straight rod lying along the x-axis with one end at $x = 0$ and the
other end at $x = L$. The rod may have an uneven distribution of mass. Its
linear density (its mass per unit length) is then a function of position,
which we denote by $\lambda(x)$. The mass of a short segment of the rod between
$x$ and $x + \delta x$ is given by

\[ \delta m \approx \lambda(x)\,\delta x, \]

and the mass of the whole rod is found by adding up the masses of all of
its segments. In the limit where the rod is divided into an infinite number
of infinitesimally short segments, the sum becomes an integral and the
total mass of the rod is expressed as

\[ M = \int_0^L \lambda(x)\,dx. \]


A new feature introduced in this unit is to allow the rod to be curved 
(Figure 2). The task of finding the mass of a curved rod can again be 
approached by dividing the rod into many short segments, finding the 
mass of each segment, and adding all the contributions together. In the 
limit where the segments become infinitesimally short, the sum becomes an 
integral. However, this integral is not along the x-axis, but is along the 
curved path occupied by the rod. Such an integral is called a line integral. 


There are a couple of problems that must be solved before we can evaluate 
a line integral of this sort. When we split the curved rod into segments, we 
must have a way of labelling the different segments. This is done by 
choosing a parameter t that increases smoothly from one end of the rod to 
the other. We can then talk about a segment of the rod for which the
parameter has values between $t$ and $t + \delta t$. We also need to find an
expression for the length of a segment in terms of $t$ and $\delta t$. Once these
problems have been solved, the line integral can be expressed as a definite 
integral over the parameter t. You will see how this works in Section 1. 


In Section 2, you will also see how to define and evaluate the line integrals 
of vector fields. The idea is simple enough: at each point along the path, 
we take the component of the field that is parallel to the path, and then 
integrate this component along the path. This turns out to be a very 
powerful tool in physical applications. 


For example, when you lift an object, you expend energy working against
gravity (Figure 3) and the lifted object gains energy as a result. The object
may be lifted straight upwards, or in a circular arc, or in a spiral, however
you like. Whichever path is chosen, the energy transferred to the object by
the applied force is given by the line integral of the force along the object’s
path. This is the line integral of a vector field. Such calculations give
physicists a precise language for talking about energy transfers.

Figure 1 A straight rod
Figure 2 A curved rod
Figure 3 Lifting weights
Figure 4 A diverging vector field V and a spherical surface S (cross-sectional view)
Figure 5 A circulating vector field V and a closed path C


You might ask whether the energy needed to move an object from one 
point A to another point B depends on the detailed path taken, or just the 
points A and B. The answer depends on the forces acting. In 
mathematical terms, we need to find the conditions under which line 
integrals depend only on their start and end points. Section 3 will show 
that some types of vector field have line integrals that are always 
path-independent. Such fields are said to be conservative. 


The remainder of the unit uses integrals to get a deeper, and more 
powerful, understanding of divergence and curl. You will remember that 
Unit 9 made tentative interpretations of divergence and curl: divergence 
was associated with the local outflow of a vector field, and curl with a local 
rotation or swirling. In this unit we will be more precise. 


Figure 4 shows a vector field V, together with an imaginary spherical 
surface S. Let us suppose that the field represents the flow of a fluid. Then 
it is apparent from the diagram that there is a net flow outwards, from the 
inside of the sphere into the exterior space. We can quantify this outward 
flow by integrating the outward normal component of the field over the 
surface of the sphere. This involves a type of surface integral called the 
flux of the vector field. We will use this idea to establish a precise link 
between divergence and outflow. 


Something similar can be achieved for curl. Figure 5 shows a vector field V 
that represents a different type of fluid flow — one that circulates rather 
than radiates outwards. The figure also shows an imaginary closed path C, 
traversed in an anticlockwise sense (as indicated by the blue arrow). Then 
we can calculate the line integral of the vector field V around the path C. 
This gives a quantity called the circulation of the field around C. We will
use this idea to establish a precise link between curl and rotation. 


The links between flux and divergence, and between circulation and curl, 
are encapsulated by two important theorems — the divergence theorem and 
the curl theorem — which are discussed and used in Sections 4 and 5. 
These theorems give physicists and engineers powerful tools for exploring 
real phenomena involving fields, and are essential knowledge for anyone 
who wants to understand electromagnetic fields or fluid flow. 


Study guide 


This unit builds on all the previous units in this book. You need to be 
familiar with partial derivatives from Unit 7, surface and volume integrals 
from Unit 8, and gradients, divergences and curls from Unit 9. 


The first half of the unit deals with line integrals of various types. 
Section 1 covers line integrals of scalar fields, while Section 2 covers line 
integrals of vector fields. Special vector fields with path-independent line 
integrals are discussed in Section 3. 


The second half of the unit uses integrals to gain more powerful insights 
into divergence and curl. Section 4 defines the flux of a vector field, and 
relates it to divergence using the divergence theorem. Section 5 defines the 
concept of circulation, and relates it to curl using the curl theorem. 


1 Line integrals of scalar fields 


Imagine a whale taking a meandering journey through the ocean. Within 
the ocean there are huge numbers of plankton, the whale’s staple food 
supply. The plankton are distributed unevenly, so the number of plankton 
per unit volume is a function n(r) of position r. 


Let us break down the whale’s journey into many short steps or segments.
The $i$th step starts at position $\mathbf{r}_i$ and is of length $\delta l_i$. The number of
plankton encountered by the whale during this step is

\[ \delta N_i \approx n(\mathbf{r}_i)\,A\,\delta l_i, \]

where A is the area of the whale’s open mouth (assumed to be
permanently open). We can also write this as

\[ \delta N_i \approx \lambda_i\,\delta l_i, \]

where $\lambda_i = n(\mathbf{r}_i)\,A$ is the number of plankton per unit length that are
within range of the whale’s open mouth in step $i$.

The total number of plankton N encountered by the whale during its
journey is found by adding together contributions from each step, so

\[ N \approx \sum_i \lambda_i\,\delta l_i. \]

This is an approximation because the number of plankton per unit length
may vary within a step. We really ought to consider the sum in the limit of
an infinite number of steps, each of vanishingly small size. In this limit,
the sum becomes an integral and the total number of plankton
encountered is written as

\[ N = \int_C \lambda(\mathbf{r})\,dl. \]

An integral like this is called a line integral. Of course, we have not told
you how to evaluate such an integral; we will do that shortly. For the
moment, we focus on the concept and the notation that expresses it.
Notice that the integral sign does not have lower and upper limits.
Instead, it carries the symbol C, which labels the whale’s path. In general,
the detailed path C matters, not just its start and end points. Along some
paths between given start and end points, a whale may encounter many
plankton; along others, it may find very few.
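
The passage from a sum over steps to a line integral can be illustrated numerically. The sketch below is purely illustrative: the straight path from (0, 0, 0) to (3, 4, 0), the density n(r) = 1 + x and the unit mouth area are hypothetical choices, not taken from the text. It evaluates the sum of n(r_i) A δl_i for finer and finer subdivisions of the path.

```python
import math

def plankton_sum(points, density, mouth_area):
    """Approximate N by summing n(r_i) * A * delta_l_i over straight steps."""
    total = 0.0
    for start, end in zip(points[:-1], points[1:]):
        step_length = math.dist(start, end)            # delta l_i
        total += density(start) * mouth_area * step_length
    return total

def straight_path(n_steps):
    """Hypothetical path: a straight swim from (0, 0, 0) to (3, 4, 0)."""
    return [(3.0 * k / n_steps, 4.0 * k / n_steps, 0.0)
            for k in range(n_steps + 1)]

def density(r):
    """Hypothetical plankton density n(r) = 1 + x (arbitrary units)."""
    return 1.0 + r[0]

for n_steps in (10, 100, 1000):
    print(n_steps, plankton_sum(straight_path(n_steps), density, mouth_area=1.0))
```

For this particular choice the sums approach 12.5 as the number of steps grows, which is the value of the corresponding line integral along the straight path.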
Another example of a line integral arises when we calculate the length of a
path. How far does the whale swim as it travels along a path C? We again
divide the path into many segments. The total length of the path is then
approximated by

\[ L \approx \sum_i \delta l_i, \]

where $\delta l_i$ is the length of the $i$th segment.

Taking the limit of an infinite number of segments, each of vanishingly
small length, the total length of the path is given by the line integral

\[ L = \int_C dl. \]

The notation is deceptively simple. Remember that the integral is along
the path C, and of course the length obtained depends on the path. In
general, you cannot just do the integral over $l$ and then substitute in limits.
There is a special technique for doing line integrals, which we now explain.


1.1 Line integrals in Cartesian coordinates


In this subsection, we discuss line integrals of scalar fields along given
paths, calculated using Cartesian coordinates $(x, y, z)$.


A path is more than a curve!

A path is a curve with a definite sense of progression from a start
point to an end point. In diagrams, the sense of progression may be
indicated by an arrow, as in Figure 6.

Figure 6 An anticlockwise path around a quarter-circle


Parametric representation of a path


The concept of a line integral along a path is based on the idea of splitting
the path into many short segments. In the discussion above we labelled the
segments by an index $i$, but this is not convenient for calculations. Instead,
we choose a continuous parameter $t$ that increases smoothly from the
beginning of the path to the end. We can then talk about a segment of the
path for which the parameter has values between $t$ and $t + \delta t$.


We must specify the shape of the path. This is done by expressing the
coordinates of points on the path as functions of the parameter $t$. For a
path in three-dimensional space, we write

\[ x = x(t), \quad y = y(t), \quad z = z(t) \quad (t_1 \le t \le t_2), \]

where $t = t_1$ at the start point and $t = t_2$ at the end point. (The minimum
and maximum values of $t$ always correspond to the start and end points of
the path, respectively.) Such a set of equations is said to give a
parametric representation, or a parametrisation, of the path. For
two-dimensional paths in the $xy$-plane, $z(t) = 0$, but this is usually left
out of the description.


Let us take the path in Figure 6 as an example. This path starts at the
point $A = (R, 0)$ on the x-axis, progresses anticlockwise around a circular
arc of radius R, and ends at the point $B = (0, R)$ on the y-axis. You can
imagine an insect crawling along this path, starting from A at time $t = 0$,
and travelling at a steady rate until it reaches B at time $t = \pi/2$. Then at
time $t$, the x- and y-coordinates of the insect are

\[ x(t) = R\cos t, \quad y(t) = R\sin t \quad (0 \le t \le \pi/2). \tag{1} \]


These equations provide a parametric representation of the path. They can 
also be written in the vector form 


\[ \mathbf{r}(t) = R\cos t\,\mathbf{i} + R\sin t\,\mathbf{j}, \]


where r(t) is the position vector of the insect at time t. The relationship 
between r(t) and the components in equations (1) is shown in Figure 7. 


The parameter t always increases along the path, so in our analogy, the 
insect always moves forwards. However, the insect could progress in a 
variety of ways, giving a variety of parametric representations. For 
example, we could have 


\[ x(t) = R\cos(t^2), \quad y(t) = R\sin(t^2) \quad (0 \le t \le \sqrt{\pi/2}). \tag{2} \]
This is an equally valid parametric representation of the path in Figure 6,
but corresponds to a non-uniform rate of progression.


The picture of an insect crawling along a path is just a device to aid
understanding. The parameter $t$ need not represent time; it could be any
quantity that increases along the path. In particular, if we consider the 
path traced out by a whale as it swims through the ocean, the parametric 
representation of this path need not describe the location of the whale as a 
function of time! 


Each parametrisation defines a certain path. The following parametric 
equations correspond to the path in Figure 8, which is a quarter-circle 
traversed clockwise: 


\[ x(t) = R\sin t, \quad y(t) = R\cos t \quad (0 \le t \le \pi/2). \]


This is not the same as the path in Figure 6 because the sense of 
progression has been reversed. 


In some cases, the start point and end point are identical. For example, 
the equations 


\[ x(t) = R\cos t, \quad y(t) = R\sin t \quad (0 \le t \le 2\pi) \]


represent the circular path shown in Figure 9, which starts and ends at the 
point (R,0) on the x-axis. In general, paths that have distinct start and 
end points are said to be open, while paths with identical start and end 
points are said to be closed. Any journey where you leave your house in 
the morning and return in the evening is a closed path. 
Figure 7 Cartesian components of a point on a path
Figure 8 A clockwise path around a quarter-circle
Figure 9 An anticlockwise path around a circle

Recall that if $\mathbf{a} = a_x\,\mathbf{i} + a_y\,\mathbf{j}$, then $|\mathbf{a}| = \sqrt{a_x^2 + a_y^2}$.


The length of a short segment of a path


Figure 10 shows a short segment of a path that lies in the $xy$-plane. This
segment begins at point P, with parameter value $t$ and coordinates $(x, y)$,
and ends at point Q, with parameter value $t + \delta t$ and coordinates
$(x + \delta x, y + \delta y)$.

Figure 10 A short segment of a path


The displacement vector from P to Q is given by

\[ \delta\mathbf{s} = \delta x\,\mathbf{i} + \delta y\,\mathbf{j}. \tag{3} \]

Dividing and multiplying the right-hand side by $\delta t$, we get

\[ \delta\mathbf{s} = \left(\frac{\delta x}{\delta t}\,\mathbf{i} + \frac{\delta y}{\delta t}\,\mathbf{j}\right)\delta t \approx \left(\frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j}\right)\delta t, \tag{4} \]

where the last step follows because we are assuming that $\delta t$ is very small.
The magnitude of this displacement is the square root of the sum of the
squares of the components. Because $\delta t > 0$, we get

\[ |\delta\mathbf{s}| \approx \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\;\delta t. \]

This approximates the distance between the points P and Q. We are
interested in the curved segment of path between P and Q, which has
length $\delta l$. However, if the curve is reasonably smooth, and P and Q are
very close together, then $\delta l$ is well approximated by $|\delta\mathbf{s}|$. In the limit where
the points P and Q approach one another, any error introduced by this
approximation becomes negligible. We can therefore say that the length of
a tiny segment of the path, with parameter values between $t$ and $t + \delta t$, is

\[ \delta l \approx \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\;\delta t. \tag{5} \]

This result applies to any path confined to the $xy$-plane.


For a path in three-dimensional space, we go through a similar argument
but include the z-coordinate. The displacement vector then becomes

\[ \delta\mathbf{s} = \delta x\,\mathbf{i} + \delta y\,\mathbf{j} + \delta z\,\mathbf{k}, \]

and the expression for the length of a tiny segment of the path is

\[ \delta l \approx \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}\;\delta t. \tag{6} \]

The length of a path 


The simplest use of line integrals is to find the total length of a path. We 
begin with the two-dimensional case. 


Suppose that we have a path C in the xy-plane, starting at point A and 
ending at point B. To find the length of this path, we must add up the 
lengths of all of its segments, taking the limit of an infinite number of 
segments, each of infinitesimal length. In this limit, the sum becomes an 
integral, and the total length L of the path is given by the following 
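definite integral, where $t_1$ and $t_2$ are the parameter values at the start and
end of the path.


Length of a path in the xy-plane

\[ L = \int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\;dt. \tag{7} \]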


Example 1 


Use the parametrisation of equations (1) to find the length of the 
quarter-circle path in Figure 6. 


Solution 
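The parametric equations are

\[ x(t) = R\cos t, \quad y(t) = R\sin t \quad (0 \le t \le \pi/2), \]

so
\[ \frac{dx}{dt} = -R\sin t \quad \text{and} \quad \frac{dy}{dt} = R\cos t. \]
Hence
\[
\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 = (-R\sin t)^2 + (R\cos t)^2 = R^2(\sin^2 t + \cos^2 t) = R^2.
\]
Using equation (7), the total length of the path is

\[ L = \int_0^{\pi/2} \sqrt{R^2}\;dt = \int_0^{\pi/2} R\,dt = \tfrac{1}{2}\pi R, \]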


as expected for a quarter of the circumference of a circle of radius R. 
We can calculate the length of the same path using the alternative
parametrisation of equations (2). In this case, the parametric equations are

\[ x(t) = R\cos(t^2), \quad y(t) = R\sin(t^2) \quad (0 \le t \le \sqrt{\pi/2}), \]

and the chain rule of ordinary differentiation gives

\[ \frac{dx}{dt} = -R\sin(t^2) \times 2t \quad \text{and} \quad \frac{dy}{dt} = R\cos(t^2) \times 2t. \]
Hence
\[
\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 = 4R^2t^2\big(\sin^2(t^2) + \cos^2(t^2)\big) = 4R^2t^2,
\]
and the total length of the path is

\[ L = \int_0^{\sqrt{\pi/2}} 2Rt\,dt = 2R\left[\tfrac{1}{2}t^2\right]_0^{\sqrt{\pi/2}} = \tfrac{1}{2}\pi R, \]
0 


as before. 


Any valid parametrisation can be used, and the answer will always be the 
same, but some choices make life easier than others! 


A line integral along a given path does not depend on the choice of 
parametrisation. You are free to choose any parametrisation you like, 
provided that it gives a correct representation of the path. 


You need not worry about which parametrisation to use; where it is 
not obvious, we will always suggest an appropriate choice. 
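
The independence of the result from the parametrisation can be seen numerically. The following minimal sketch evaluates the length integral of equation (7) by the midpoint rule for the two parametrisations of the quarter-circle, equations (1) and (2). The function name, the choice R = 2 and the number of subintervals are illustrative assumptions.

```python
import math

def path_length(dx_dt, dy_dt, t1, t2, n=20000):
    """Midpoint-rule estimate of the integral of sqrt((dx/dt)^2 + (dy/dt)^2) dt."""
    dt = (t2 - t1) / n
    total = 0.0
    for i in range(n):
        t = t1 + (i + 0.5) * dt
        total += math.hypot(dx_dt(t), dy_dt(t)) * dt
    return total

R = 2.0   # illustrative radius

# Parametrisation (1): x = R cos t, y = R sin t, 0 <= t <= pi/2.
L1 = path_length(lambda t: -R * math.sin(t),
                 lambda t: R * math.cos(t),
                 0.0, math.pi / 2)

# Parametrisation (2): x = R cos(t^2), y = R sin(t^2), 0 <= t <= sqrt(pi/2).
L2 = path_length(lambda t: -2 * t * R * math.sin(t * t),
                 lambda t: 2 * t * R * math.cos(t * t),
                 0.0, math.sqrt(math.pi / 2))

print(L1, L2, 0.5 * math.pi * R)   # all three should agree closely
```

The derivatives are supplied by hand here; the same function could be reused for any path in the xy-plane whose derivatives are known.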


We can also consider the reverse path, shown in Figure 8. This occupies 
the same quarter-circle curve as before, but the start and end points are 
interchanged, so the path is traversed in the reverse sense. This reverse 
path is parametrised by the equations 


\[ x(t) = R\sin t, \quad y(t) = R\cos t \quad (0 \le t \le \pi/2). \]

It is intuitively obvious that this path must have the same length as
before, and this can be easily verified. We have

\[ \frac{dx}{dt} = R\cos t, \quad \frac{dy}{dt} = -R\sin t, \]
so
\[ \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 = (R\cos t)^2 + (-R\sin t)^2 = R^2, \]
giving

\[ L = \int_0^{\pi/2} R\,dt = \tfrac{1}{2}\pi R, \]
0 


as before. 


In general, the lengths of curves, and the line integrals of scalar 
functions, do not depend on the sense in which a path is traversed. 


Why do we bother to distinguish between curves and paths? You will see 
that this distinction does matter for the line integrals of vector fields. Our 
terminology bears this in mind. 


Exercise 1 

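The parabolic arc shown in the margin has parametric representation
\[ x(t) = 2t, \quad y(t) = t^2 \quad (-1 \le t \le 1). \]

What is the length of this arc? You may use the standard integral

\[ \int \sqrt{1 + x^2}\;dx = \tfrac{1}{2}\left(x\sqrt{1 + x^2} + \ln\!\big(x + \sqrt{1 + x^2}\big)\right) + C. \]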


The method is easily extended to paths in three-dimensional space. In this 
case, the parametric representation gives x, y and z as functions of the 
parameter t. The formula for the length of a segment is given by 

equation (6), and the total length of the path is given by the following 
expression. 


Length of a path in three dimensions 


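\[ L = \int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}\;dt, \tag{8} \]

where $t_1$ and $t_2$ are the parameter values at the start and end of the
path.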
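
The three-dimensional formula can be tried out numerically. The sketch below is an illustration under stated assumptions: the path x = t, y = t^2, z = 2t^3/3 for 0 ≤ t ≤ 1 is a hypothetical example (not one from the text), chosen because the integrand simplifies to 1 + 2t^2 and the exact length is 5/3. The derivatives are estimated by central differences, so only the parametrisation itself has to be supplied.

```python
import math

def path_length_3d(r, t1, t2, n=2000, h=1e-6):
    """Estimate the length of the path r(t) = (x, y, z) for t1 <= t <= t2.

    Uses the midpoint rule for the integral and central differences
    for dx/dt, dy/dt and dz/dt.
    """
    dt = (t2 - t1) / n
    total = 0.0
    for i in range(n):
        t = t1 + (i + 0.5) * dt
        ahead, behind = r(t + h), r(t - h)
        speed = math.sqrt(sum(((a - b) / (2 * h)) ** 2
                              for a, b in zip(ahead, behind)))
        total += speed * dt
    return total

def example_path(t):
    """Hypothetical path: x = t, y = t^2, z = 2 t^3 / 3."""
    return (t, t * t, 2.0 * t ** 3 / 3.0)

print(path_length_3d(example_path, 0.0, 1.0), 5.0 / 3.0)
```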
Exercise 2
One turn of a helical path has the parametric representation
\[ x(t) = a\cos t, \quad y(t) = a\sin t, \quad z(t) = bt \quad (0 \le t \le 2\pi), \]

where $a$ and $b$ are positive constants. Find the length of this path.


Line integrals of scalar functions 


We sometimes need to integrate a scalar function along a given path. For 
example, we might find the total mass of a curved rod by integrating its 
linear density (its mass per unit length) along the rod. 


Starting at one end of the rod, we follow a path C that tracks along the 
rod and stops at the other end. Then the total mass of the rod can be 
expressed as the line integral 


\[ M = \int_C \lambda\,dl, \]

where $\lambda$ is the linear density at points along the curved rod.

In two dimensions, the path is described by parametric equations
\[ x = x(t), \quad y = y(t) \quad (t_1 \le t \le t_2), \]
and the linear density of the rod is given by a function
\[ \lambda = \lambda(x(t), y(t)). \]
We also know that a short segment of the rod has length

\[ \delta l \approx \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\;\delta t. \]

This leads to the following expression for the total mass of the rod.

\[ M = \int_{t_1}^{t_2} \lambda(x(t), y(t))\,\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\;dt. \tag{9} \]


A similar formula applies in three dimensions, with obvious adjustments to 
include z(t) and (dz/dt)?. Formulas like this also apply to other quantities 
that are given per unit length of a path. For example, $\lambda$ could represent
the number of accessible plankton per unit length along a whale’s path. 


Example 2 


A non-uniform curved rod lies in the xy-plane. Its coordinates (in metres) 
are given by the parametric equations 


\[ x(t) = 2t, \quad y(t) = 1 - t^2 \quad (-1 \le t \le 1). \]

The linear density of the rod (in kilograms per metre) is

\[ \lambda(x, y) = \frac{1}{\sqrt{x^2 + y^2}}. \]


What is its mass? 
Solution 


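From the given parametric equations, we have

\[ x^2 + y^2 = 4t^2 + (1 - t^2)^2 = 1 + 2t^2 + t^4 = (1 + t^2)^2, \]

so
\[ \lambda(x(t), y(t)) = \frac{1}{1 + t^2}. \]
Also,
\[ \frac{dx}{dt} = 2 \quad \text{and} \quad \frac{dy}{dt} = -2t, \]
so
\[ \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} = \sqrt{4 + 4t^2} = 2\sqrt{1 + t^2}. \]

The mass of the rod is therefore

\[ M = \int_{-1}^{1} \frac{1}{1 + t^2} \times 2\sqrt{1 + t^2}\;dt = 2\int_{-1}^{1} \frac{1}{\sqrt{1 + t^2}}\;dt. \]
Using a standard integral given in the Handbook, we get
\[
\begin{aligned}
M &= 2\Big[\ln\!\big(t + \sqrt{1 + t^2}\big)\Big]_{-1}^{1} \\
&= 2\ln(\sqrt{2} + 1) - 2\ln(\sqrt{2} - 1) \approx 3.53.
\end{aligned}
\]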


So the mass of the rod is approximately 3.53 kilograms. 
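
The value found in Example 2 can be checked numerically. The following Python sketch applies equation (9) with a composite Simpson's rule. The helper names `simpson` and `rod_mass` are illustrative only, and the density is assumed to be 1/sqrt(x^2 + y^2), which agrees with the value 1/(1 + t^2) used in the solution.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def rod_mass(density, x, y, dxdt, dydt, t1, t2):
    """Equation (9): integrate density(x(t), y(t)) * sqrt(x'(t)^2 + y'(t)^2) over t."""
    def integrand(t):
        return density(x(t), y(t)) * math.hypot(dxdt(t), dydt(t))
    return simpson(integrand, t1, t2)

# Data of Example 2 (density assumed to be 1/sqrt(x^2 + y^2)).
mass = rod_mass(
    density=lambda x, y: 1 / math.sqrt(x * x + y * y),
    x=lambda t: 2 * t,
    y=lambda t: 1 - t * t,
    dxdt=lambda t: 2.0,
    dydt=lambda t: -2 * t,
    t1=-1.0,
    t2=1.0,
)

# Closed form obtained in the solution.
exact = 2 * math.log(math.sqrt(2) + 1) - 2 * math.log(math.sqrt(2) - 1)
print(round(mass, 2), round(exact, 2))
```

The same function can be reused for any other quantity given per unit length of a path, by changing the density and the parametric equations.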


Exercise 3 
A semicircular path C has parametric representation
\[ x(t) = R\cos t, \quad y(t) = R\sin t \quad (0 \le t \le \pi), \]


where R is a positive constant. On this path, the linear number density of 
ants (i.e. the number of ants per unit length of path) is given by 


A 
Xa, y) = Rt ry, 


where A is a positive constant. What is the total number of ants on the 
path? 


1.2 Line integrals in orthogonal coordinates 


Line integrals can also be evaluated using non-Cartesian coordinates. 
In this subsection, we describe how this is done. This is optional 
material, and will not be assessed. However, you should read it if you 
are interested in physics or astronomy, as there are close links with 
the mathematics used in Einstein’s theory of relativity. 


As an example, let us see how the length of a path is calculated in polar 
coordinates. We consider a path in the xy-plane that is defined using polar 
coordinates $(r, \phi)$. This means that its parametric representation is given
in the form


\[ r = r(t), \quad \phi = \phi(t) \quad (t_1 \le t \le t_2). \]


Figure 11 shows the short segment between points P and Q. The point P
has polar coordinates $(r, \phi)$ and parameter value t, and the point Q has
polar coordinates $(r + \delta r, \phi + \delta\phi)$ and parameter value $t + \delta t$.

Figure 11 A path segment in polar coordinates (the angle $\delta\phi$ is magnified for clarity)

Figure 12 Resolving $\delta\mathbf{s}$ along polar unit vectors


The displacement vector $\delta\mathbf{s}$ from P to Q can be resolved into a radial
component $\delta r$ in the direction of the unit vector $\mathbf{e}_r$, and a transverse
component $r\,\delta\phi$ in the direction of the unit vector $\mathbf{e}_\phi$ (see Figure 12). We
therefore have


\[ \delta\mathbf{s} = \delta r\, \mathbf{e}_r + r\, \delta\phi\, \mathbf{e}_\phi. \tag{10} \]


Note the factor r in the last term on the right-hand side. This is the scale
factor needed to convert a change in angle, $\delta\phi$, into an appropriate length.


Bearing the scale factor in mind, we can follow the same argument as that 
given earlier for Cartesian coordinates. In polar coordinates, the 
expression for the length of a small segment of the path is 


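\[ \delta l \simeq \sqrt{\left(\frac{dr}{dt}\right)^2 + r^2 \left(\frac{d\phi}{dt}\right)^2}\; \delta t, \]


and the total length of the path between points with parameter values $t = t_1$
and $t = t_2$ is


\[ L = \int_{t_1}^{t_2} \sqrt{\left(\frac{dr}{dt}\right)^2 + r^2 \left(\frac{d\phi}{dt}\right)^2}\; dt. \tag{11} \]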


To illustrate this result, consider an anticlockwise circular path of radius R, 
centred on the origin. This may be described by the parametric equations 


\[ r(t) = R, \quad \phi(t) = t \quad (0 \le t \le 2\pi). \]
Since $dr/dt = 0$ and $d\phi/dt = 1$, the total length of the path is


\[ L = \int_0^{2\pi} \sqrt{0^2 + R^2 \times 1^2}\; dt = \int_0^{2\pi} R \, dt = 2\pi R. \]


Exercise 4 


In polar coordinates, the spiral path shown in the margin has parametric 
representation 


\[ r(t) = 2t, \quad \phi(t) = t \quad (0 \le t \le 5). \]
Find the length of this path. 


(Hint: You may use the standard integral given in Exercise 1.) 


The formula for the length of a path in polar coordinates can be 
generalised to other orthogonal coordinate systems. 


Suppose that a path in three-dimensional space is described in orthogonal
coordinates (u, v, w). The corresponding scale factors are denoted by $h_u$,
$h_v$ and $h_w$, and the unit vectors by $\mathbf{e}_u$, $\mathbf{e}_v$ and $\mathbf{e}_w$.


The path is then specified by parametric equations of the form
\[ u = u(t), \quad v = v(t), \quad w = w(t) \quad (t_1 \le t \le t_2). \]


We consider two neighbouring points on the path: P with coordinates
(u, v, w) and parameter value t, and Q with coordinates
$(u + \delta u, v + \delta v, w + \delta w)$ and parameter value $t + \delta t$ (see Figure 13).

Figure 13 Neighbouring points P and Q on a path in (u, v, w)
coordinates. Inevitably, this sketch is two-dimensional, but the path need
not be planar.


The displacement vector $\delta\mathbf{s}$ from P to Q can be resolved in the directions
of the $\mathbf{e}_u$, $\mathbf{e}_v$ and $\mathbf{e}_w$ unit vectors. For example, the component in the
direction of $\mathbf{e}_u$ is $h_u\,\delta u$. This follows directly from the definition of scale
factors. There are similar results for the other components, so the
displacement vector from P to Q is given by


\[ \delta\mathbf{s} = h_u\, \delta u\, \mathbf{e}_u + h_v\, \delta v\, \mathbf{e}_v + h_w\, \delta w\, \mathbf{e}_w. \tag{12} \]


Using the same argument as before, we reach the following conclusion. 


Length of a path in an orthogonal coordinate system 


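In an orthogonal coordinate system (u, v, w), with scale factors $h_u$, $h_v$
and $h_w$, the length of a path between points with parameter values
$t = t_1$ and $t = t_2$ is


\[ L = \int_{t_1}^{t_2} \sqrt{h_u^2 \left(\frac{du}{dt}\right)^2 + h_v^2 \left(\frac{dv}{dt}\right)^2 + h_w^2 \left(\frac{dw}{dt}\right)^2}\; dt. \tag{13} \]

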
The main point to note about this formula is the presence of the scale 
factors. These were not apparent in Cartesian coordinates (x, y, z) because 
all the scale factors are equal to 1 in this case. In polar coordinates $(r, \phi)$,
the scale factors are $h_r = 1$ and $h_\phi = r$, and there is no third coordinate,
so in this special case we recover equation (11). For reference, some useful
scale factors are listed in Table 1.


Table 1 Some scale factors

Coordinate system                            Scale factors
Polar coordinates $(r, \phi)$                $h_r = 1$, $h_\phi = r$
Cylindrical coordinates $(r, \phi, z)$       $h_r = 1$, $h_\phi = r$, $h_z = 1$
Spherical coordinates $(r, \theta, \phi)$    $h_r = 1$, $h_\theta = r$, $h_\phi = r\sin\theta$


We can apply equation (13) to paths on the surface of a sphere, such as 
those that describe journeys on the surface of the Earth. It is natural to 
use spherical coordinates $(r, \theta, \phi)$ in this case. The advantage of this choice
is that the radial coordinate does not vary: it has the constant value R,
the radius of the sphere.

The parametric equations of a path on the surface of the sphere then take
the form


\[ \theta = \theta(t), \quad \phi = \phi(t), \quad r = R \quad (t_1 \le t \le t_2). \]


Using equation (13), and taking scale factors from Table 1, the length of a
path on the surface of a sphere is given by


\[ L = R \int_{t_1}^{t_2} \sqrt{\left(\frac{d\theta}{dt}\right)^2 + \sin^2\theta \left(\frac{d\phi}{dt}\right)^2}\; dt. \tag{14} \]


Numerical methods are often needed to evaluate this integral, but the 
cases considered in the following exercise can be done by hand. 
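
When the integral in equation (14) cannot be done by hand, it can be approximated numerically. The following Python sketch is one possible approach, using the midpoint rule with central-difference derivatives. The function name `sphere_path_length` and the two sample paths are illustrative assumptions, not part of the module text.

```python
import math

def sphere_path_length(theta, phi, t1, t2, R=1.0, n=2000):
    """Approximate equation (14) for a path theta(t), phi(t) on a sphere of radius R.

    Uses the midpoint rule in t; the derivatives d(theta)/dt and d(phi)/dt
    are estimated by central differences.
    """
    h = (t2 - t1) / n
    eps = 1e-6
    total = 0.0
    for k in range(n):
        t = t1 + (k + 0.5) * h
        dtheta = (theta(t + eps) - theta(t - eps)) / (2 * eps)
        dphi = (phi(t + eps) - phi(t - eps)) / (2 * eps)
        total += math.sqrt(dtheta**2 + (math.sin(theta(t)) * dphi) ** 2) * h
    return R * total

# Half of the equator (theta = pi/2, phi from 0 to pi): the length should be pi * R.
equator = sphere_path_length(lambda t: math.pi / 2, lambda t: t, 0.0, math.pi)

# A path with no simple closed-form length: theta and phi both vary with t.
spiral = sphere_path_length(lambda t: math.pi / 4 + t / 2, lambda t: 2 * t, 0.0, 1.0)

print(equator, spiral)
```

The half-equator case has a known answer, so it serves as a sanity check before the routine is trusted on paths where both angles vary.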


Exercise 5 


Paths A and B lie on the surface of a sphere of radius R, and have the 
following parametric representations. 


Path A: $\theta(t) = t$, $\phi(t) = \pi/4$ $(0 \le t \le \pi/2)$.
Path B: $\theta(t) = \pi/6$, $\phi(t) = t$ $(0 \le t \le \pi/2)$.
Use equation (14) to find the lengths of these paths. 


Towards Einstein’s theory of general relativity 


When considering paths on a curved surface, it is natural to ask: 
what is the shortest path between two given points? Such paths are 
called geodesics. Long-distance plane routes generally follow
geodesics, although slight adjustments may be made for winds and
weather (Figure 14). The first step in finding geodesics is to have a
formula for the length of a path, and this is provided by
equation (13), although more work is needed to identify the geodesics.
It turns out that in Exercise 5, path A is a geodesic but path B is not.

Figure 14 A geodesic path from London to Los Angeles


In 1916, after years of struggle, Albert Einstein (Figure 15) created 
his theory of general relativity. This is a theory of motion under 
gravity. Rather than dealing with ordinary space, general relativity 
deals with spacetime, which combines the three dimensions of space 
with time. Along the track of a particle in spacetime, a quantity 
called proper time increases steadily; this is analogous to the length 
of a curve in ordinary space. 


General relativity is based on two extraordinary ideas. First, it asserts
that spacetime is curved by matter. Then it says that when moving
under gravity, a body follows the path of maximum proper time. By
analogy with ordinary space, such a path is called a geodesic. So to
predict motion under gravity, we must find the geodesics in spacetime,
and this brings line integrals into the heart of general relativity.

Figure 15 Albert Einstein (1879-1955)


2 Line integrals of vector fields


You have seen how to integrate scalar fields along given paths. It is also 
possible to integrate vector fields along paths, and this section explains 
how this is done. 


2.1 The basic concept 


A simple example will lead us towards the main idea. In a 100 metre race, 
competitors sprint along a straight track. When analysing the times 
achieved, officials often record the component of wind velocity in the 
direction of the race. Fast times are less impressive in a strong following 
wind, and world records cannot be claimed if such a wind is present. 


Longer races generally follow curved paths. For distances above
400 metres, the runners complete one or more laps of the stadium. In these
cases, it is not relevant to record the component of the wind velocity in 
any single direction. Assuming that the wind velocity field remains 
constant in time, a more suitable measure of wind assistance may be 
obtained as follows. 


• At each point along the path of the race, measure the component of
  the wind velocity along the direction of the path.


• Integrate this component round the path, from its start point to its
  end point.


In races round complete laps, we might expect the wind to be in the 
runners’ backs at some points, and in their faces at others. But we can 
also imagine situations where the wind swirls around the stadium like a 
gentle eddy, consistently helping the runners on their way. In practice, 
athletics officials do not concern themselves with such details, but this 
does not matter. The main point of our example is that it suggests a way 
of defining the line integral of a vector field. 


The concept of the line integral of a vector field 


Given a vector field v, and a path C leading from a start point to an
end point, the line integral of v along C is defined as follows.
At each point along the path, we take the component of v in the
direction of the path, and then integrate this along the path.

As a simple example, consider the line integral of the vector field
\[ \mathbf{v}(x, y) = x^2 (y + 1)\, \mathbf{i} + x (y - 1)\, \mathbf{j} \]
along a path C that travels along the x-axis, starting at x = 0 and ending
at x = 3. Figure 16 is an arrow map of this vector field, with the path C
shown in blue.

Figure 16 A vector field v(x, y) and a path C

Figure 17 An arrow map of a vector field F(x, y) (orange) and a path C (blue)

Figure 19 The ith step

In this case, the component of v in the direction of the path is $v_x$. Along
the path C, the component $v_x$ has the value $x^2(0 + 1) = x^2$. So the line
integral of the vector field v along the path C is


\[ \int_{x=0}^{x=3} x^2 \, dx = \left[ \tfrac{1}{3} x^3 \right]_0^3 = 9. \]

Notice that the answer is a number. A similar result applies to all line
integrals of vector fields: their values are scalars (i.e. numbers, or numbers
with associated units).


This illustrates the concept of the line integral of a vector field in the 
special case where the path of integration is along a coordinate axis. But it 
does not give us a reliable way of calculating line integrals in general. More 
usually, the path of integration is curved, and the coordinates of points on 
the path are given by parametric equations. The next subsection will show 
you how to evaluate line integrals of vector fields in this general case. Once 
we have a suitable formula, the line integrals of vector fields are just as 
easy to evaluate as those of scalar fields. 


2.2 Line integrals in Cartesian coordinates 


Suppose that we are given a two-dimensional vector field F(x, y), and we 
want to calculate its line integral along a given path C, with start point A 
and end point B (Figure 17). 


We can imagine approximating the path C by a succession of straight-line
steps, as shown in Figure 18. We do this by selecting points along the
path, with position vectors $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_{n+1}$. The sequence starts from A,
with position vector $\mathbf{r}_1$, and ends at B, with position vector $\mathbf{r}_{n+1}$. We take
a succession of straight-line steps: from $\mathbf{r}_1$ to $\mathbf{r}_2$, from $\mathbf{r}_2$ to $\mathbf{r}_3$, and so on,
until we take the last step from $\mathbf{r}_n$ to $\mathbf{r}_{n+1}$. The line integral of F along
the path C is approximated by a sum of contributions from all these steps.


Figure 18 A path approximated by straight-line steps 


Figure 19 takes a closer look at the step from $\mathbf{r}_i$ to $\mathbf{r}_{i+1}$, which involves the
displacement vector
\[ \delta\mathbf{s}_i = \mathbf{r}_{i+1} - \mathbf{r}_i. \]


At the beginning of the step, the vector field has value $\mathbf{F}(\mathbf{r}_i)$. The
contribution of this step to the complete line integral is given by
multiplying the component of the field in the direction of the step
(which is $|\mathbf{F}(\mathbf{r}_i)| \cos\theta$) by the length of the step, $|\delta\mathbf{s}_i|$. We therefore have


\[ \text{contribution of $i$th step} = |\mathbf{F}(\mathbf{r}_i)| \cos\theta \times |\delta\mathbf{s}_i| = \mathbf{F}(\mathbf{r}_i) \cdot \delta\mathbf{s}_i, \]
where the last equality follows from the definition of the scalar product
(recall that $\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \cos\theta$).


The value of the line integral along C can be closely approximated by
adding similar contributions from all the steps:


\[ \text{line integral} \simeq \sum_{i=1}^{n} \mathbf{F}(\mathbf{r}_i) \cdot \delta\mathbf{s}_i. \tag{15} \]

We consider this sum in the limit of an infinite number of infinitesimal
steps. In this limit, any approximation involved in using a succession of
straight-line steps disappears, and the sum gives the exact value of the line
integral, which is written as


\[ \text{line integral} = \int_C \mathbf{F} \cdot d\mathbf{s}. \tag{16} \]

Note that the integral sign carries the label C, which indicates the path
followed. It is not safe, in general, to write the line integral as


\[ \int_{\mathbf{r}_A}^{\mathbf{r}_B} \mathbf{F} \cdot d\mathbf{s}, \]


because this indicates only the start and end points, and makes no 
reference to the path taken between them. In general, we need to specify 
the full path before evaluating the line integral. 


Using parametric representations 


The concept of a line integral is straightforward, but there remains the 
task of evaluating equation (16) for a given vector field F and a given 
path C. As for the line integrals of scalar functions, the key is to express 
the path in parametric form. 


Let us suppose that the path C lies in the xy-plane, and that points on the
path have the parametric representation


\[ x = x(t), \quad y = y(t) \quad (t_1 \le t \le t_2), \]


where, as usual, $t = t_1$ refers to the start point of the path and $t = t_2$ refers
to the end point. We consider the small displacement $\delta\mathbf{s}$ produced when
we move from a point P with parameter value t to a neighbouring point Q
with parameter value $t + \delta t$ (see Figure 20).

Figure 20 The small displacement from P to Q

An expression for this displacement was obtained in equation (3),
\[ \delta\mathbf{s} = \delta x\, \mathbf{i} + \delta y\, \mathbf{j}, \]
so
\[ \mathbf{F} \cdot \delta\mathbf{s} = (F_x \mathbf{i} + F_y \mathbf{j}) \cdot (\delta x\, \mathbf{i} + \delta y\, \mathbf{j}) = F_x\, \delta x + F_y\, \delta y. \]

Dividing and multiplying by $\delta t$ then gives


\[ \mathbf{F} \cdot \delta\mathbf{s} = \mathbf{F} \cdot \frac{\delta\mathbf{s}}{\delta t}\, \delta t \simeq \mathbf{F} \cdot \frac{d\mathbf{s}}{dt}\, \delta t = \left( F_x \frac{dx}{dt} + F_y \frac{dy}{dt} \right) \delta t. \]


To obtain the line integral, we take the sum of expressions like this all
along the path in the limit of an infinite number of infinitesimal steps.
This gives the following result.


\[ \int_C \mathbf{F} \cdot d\mathbf{s} = \int_{t_1}^{t_2} \left( F_x \frac{dx}{dt} + F_y \frac{dy}{dt} \right) dt. \tag{17} \]


This formula is very important. It expresses the line integral of a vector 
field as a definite integral over the parameter t. To evaluate the definite 
integral, we must express the integrand on the right-hand side as a 
function of t. The following example shows how this is done. 


Example 3
Calculate the line integral of the vector field
\[ \mathbf{F} = (x - y)\, \mathbf{i} + (x + y)\, \mathbf{j} \]
along the quarter-circle path $C_1$ in Figure 21.
This path can be parametrised by the equations
\[ x = 2\cos t, \quad y = 2\sin t \quad (0 \le t \le \pi/2). \]

Figure 21 Two paths between the same points

Solution
Differentiating the parametric equations gives
\[ \frac{dx}{dt} = -2\sin t, \quad \frac{dy}{dt} = 2\cos t. \]


Expressing the components of F in terms of t gives
\[ F_x = x - y = 2(\cos t - \sin t), \]
\[ F_y = x + y = 2(\cos t + \sin t). \]


Hence


\[ \begin{aligned}
\mathbf{F} \cdot \frac{d\mathbf{s}}{dt} &= F_x \frac{dx}{dt} + F_y \frac{dy}{dt} \\
&= -4(\cos t - \sin t) \sin t + 4(\cos t + \sin t) \cos t \\
&= 4(\sin^2 t + \cos^2 t) \\
&= 4.
\end{aligned} \]

Substituting into equation (17), the required line integral is


\[ \int_{C_1} \mathbf{F} \cdot d\mathbf{s} = \int_0^{\pi/2} 4 \, dt = \left[ 4t \right]_0^{\pi/2} = 2\pi. \]


Line integrals generally depend on the path between their start and end
points. For example, the line integral of F over the path $C_2$ in Figure 21
has a different value from that calculated in Example 3, as you can confirm.


Exercise 6
Calculate the line integral of the vector field
\[ \mathbf{F} = (x - y)\, \mathbf{i} + (x + y)\, \mathbf{j} \]


along the straight-line path $C_2$ shown in Figure 21. This path can be
parametrised by the equations


\[ x = 2 - t, \quad y = t \quad (0 \le t \le 2). \]


The method used in the above example and exercise applies to all line 
integrals of vector fields. In three dimensions, the parametric equations 
also include a function z(t), and the expression for ds/dt has a component 
dz/dt. The following procedure refers to this three-dimensional case, but it 
is readily adapted to curves in the xy-plane by omitting the terms 
involving z. 


Procedure 1 Finding the line integral of a vector field


Given a vector field $\mathbf{F} = F_x \mathbf{i} + F_y \mathbf{j} + F_z \mathbf{k}$, and a path C with the
parametric representation
\[ x = x(t), \quad y = y(t), \quad z = z(t) \quad (t_1 \le t \le t_2), \]
the line integral of F along the path C can be found as follows.
1. Use the parametric representation to find the components of
\[ \frac{d\mathbf{s}}{dt} = \frac{dx}{dt}\, \mathbf{i} + \frac{dy}{dt}\, \mathbf{j} + \frac{dz}{dt}\, \mathbf{k}. \]

2. Express the components of F as functions of the parameter t.
3. Find the scalar product
\[ \mathbf{F} \cdot \frac{d\mathbf{s}}{dt} = F_x \frac{dx}{dt} + F_y \frac{dy}{dt} + F_z \frac{dz}{dt} \]

as a function of t.


4. Evaluate the line integral as a definite integral over t:


\[ \int_C \mathbf{F} \cdot d\mathbf{s} = \int_{t_1}^{t_2} \mathbf{F} \cdot \frac{d\mathbf{s}}{dt}\; dt. \tag{19} \]

Example 4

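Evaluate the line integral of the vector field
\[ \mathbf{F} = x^2\, \mathbf{i} + y^2\, \mathbf{j} + z^2\, \mathbf{k} \]

along a path C with the parametric equations
\[ x = 3t^2, \quad y = 4t^2, \quad z = 5t^2 \quad (0 \le t \le 1). \]

Solution

Differentiating the parametric equations, we get
\[ \frac{dx}{dt} = 6t, \quad \frac{dy}{dt} = 8t, \quad \frac{dz}{dt} = 10t. \]


Expressing the components of F in terms of t gives
\[ F_x = x^2 = 9t^4, \quad F_y = y^2 = 16t^4, \quad F_z = z^2 = 25t^4. \]
So
\[ \mathbf{F} \cdot \frac{d\mathbf{s}}{dt} = 9t^4 \times 6t + 16t^4 \times 8t + 25t^4 \times 10t = 432t^5. \]


Substituting into equation (19), the line integral is


\[ \int_C \mathbf{F} \cdot d\mathbf{s} = \int_0^1 432t^5 \, dt = 432 \left[ \tfrac{1}{6} t^6 \right]_0^1 = 72. \]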
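
The four steps of Procedure 1 translate directly into a short numerical routine. The following Python sketch is a minimal illustration, assuming that the path and its derivative ds/dt are supplied as functions of t; the name `line_integral` and its arguments are hypothetical. It reproduces the values obtained in Examples 3 and 4, taking the field of Example 4 to be x^2 i + y^2 j + z^2 k, as the worked values 9t^4, 16t^4 and 25t^4 imply.

```python
import math

def line_integral(F, path, velocity, t1, t2, n=2000):
    """Integrate F(r(t)) . dr/dt over [t1, t2] using composite Simpson's rule.

    F(x, y, z) returns the components (Fx, Fy, Fz); path(t) returns (x, y, z);
    velocity(t) returns (dx/dt, dy/dt, dz/dt). n must be even.
    """
    def integrand(t):
        Fx, Fy, Fz = F(*path(t))          # step 2: components of F in terms of t
        dx, dy, dz = velocity(t)          # step 1: components of ds/dt
        return Fx * dx + Fy * dy + Fz * dz  # step 3: scalar product

    h = (t2 - t1) / n
    total = integrand(t1) + integrand(t2)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(t1 + k * h)
    return total * h / 3                  # step 4: definite integral over t

# Example 3: quarter circle of radius 2 in the xy-plane (z-terms set to zero).
example3 = line_integral(
    F=lambda x, y, z: (x - y, x + y, 0.0),
    path=lambda t: (2 * math.cos(t), 2 * math.sin(t), 0.0),
    velocity=lambda t: (-2 * math.sin(t), 2 * math.cos(t), 0.0),
    t1=0.0, t2=math.pi / 2,
)

# Example 4: F = x^2 i + y^2 j + z^2 k along x = 3t^2, y = 4t^2, z = 5t^2.
example4 = line_integral(
    F=lambda x, y, z: (x * x, y * y, z * z),
    path=lambda t: (3 * t * t, 4 * t * t, 5 * t * t),
    velocity=lambda t: (6 * t, 8 * t, 10 * t),
    t1=0.0, t2=1.0,
)

print(example3, 2 * math.pi)
print(example4)
```

Supplying the derivative analytically mirrors step 1 of the procedure; a finite-difference estimate could be used instead if only the path itself is known.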


Exercise 7 

Calculate the line integral of the vector field
\[ \mathbf{F} = yz\, \mathbf{i} + xz\, \mathbf{j} + xy\, \mathbf{k} \]

along a path C with parametric equations


\[ x = t, \quad y = 1 + 2t, \quad z = 4 \quad (0 \le t \le 1). \]


If we reverse the direction of the path of a line integral of a vector field,
tracing out the same curve but in the opposite sense, the magnitude of the
line integral remains unchanged, but its sign is reversed.
(This is different from the behaviour of the lengths of curves and the
line integrals of scalar functions, which are unchanged by a
reversal of the path.)


This follows directly from equation (15), which expresses the line integral
of a vector field F as a sum of contributions of the form $\mathbf{F}(\mathbf{r}_i) \cdot \delta\mathbf{s}_i$. When
we reverse the direction of the path, the sign of each small displacement
$\delta\mathbf{s}_i$ changes, so the contribution changes to
\[ \mathbf{F}(\mathbf{r}_i) \cdot (-\delta\mathbf{s}_i) = -\mathbf{F}(\mathbf{r}_i) \cdot \delta\mathbf{s}_i. \]


Since this is true for all contributions along the path, the line integral
along the reverse path is minus the line integral along the original path.
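
The sign reversal can also be seen numerically. The following Python sketch approximates a line integral by a sum over straight-line steps in the spirit of equation (15), except that the field is sampled at the midpoint of each step (an assumption made here to improve accuracy). It uses the field and quarter-circle path of Example 3; the function name `step_sum` is illustrative only.

```python
import math

def step_sum(F, path, t1, t2, n=4000):
    """Approximate a line integral by summing F . (step displacement) over n steps.

    F(x, y) returns (Fx, Fy); path(t) returns (x, y). The field is evaluated
    at the parameter midpoint of each step.
    """
    h = (t2 - t1) / n
    total = 0.0
    for k in range(n):
        x0, y0 = path(t1 + k * h)
        x1, y1 = path(t1 + (k + 1) * h)
        Fx, Fy = F(*path(t1 + (k + 0.5) * h))
        total += Fx * (x1 - x0) + Fy * (y1 - y0)
    return total

def field(x, y):
    """The vector field of Example 3."""
    return (x - y, x + y)

def quarter_circle(t):
    """The path C1 of Example 3, for 0 <= t <= pi/2."""
    return (2 * math.cos(t), 2 * math.sin(t))

def reversed_quarter_circle(t):
    """The same curve traced backwards, obtained by replacing t with pi/2 - t."""
    return quarter_circle(math.pi / 2 - t)

forward = step_sum(field, quarter_circle, 0.0, math.pi / 2)
backward = step_sum(field, reversed_quarter_circle, 0.0, math.pi / 2)
print(forward, backward)
```

The two printed values have the same magnitude (close to the value found in Example 3) and opposite signs.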


You can check this in the following exercise. 


Exercise 8 
Calculate the line integral of the vector field
\[ \mathbf{F} = yz\, \mathbf{i} + xz\, \mathbf{j} + xy\, \mathbf{k} \]
along the path $C_{\text{rev}}$, with parametric equations
\[ x = 1 - t, \quad y = 3 - 2t, \quad z = 4 \quad (0 \le t \le 1). \]
(This is the reverse of the path C in Exercise 7, obtained by
replacing t with 1 − t in the parametric equations.)


The following optional box makes a link between line integrals and the 
concept of energy; this link is of fundamental importance in physics. 


Line integrals and energy 


Think of a particle of mass m, moving along a path C. For example,
it could be a ball that has been thrown in the air, and follows a
parabolic arc back to the ground (Figure 22). At each point r, the
particle experiences a force F(r), and we can calculate the line
integral of F(r) along the particle’s path:


\[ \int_C \mathbf{F} \cdot d\mathbf{s} = \int_{t_1}^{t_2} \mathbf{F} \cdot \frac{d\mathbf{s}}{dt}\; dt, \tag{20} \]


where t is the parameter used to label points along the path, with
$t = t_1$ at the start point, and $t = t_2$ at the end point.

Figure 22 A thrown ball follows a parabolic arc


The parameter t need not have any physical significance. However, we 
are free to choose it to be the time elapsed, which increases as the 
particle traces out its path. This choice does not affect the value of 
the line integral, but helps us to interpret its meaning. 


With t interpreted as time, ds/dt is the velocity v of the particle.
Also, Newton’s second law tells us that ‘force is equal to mass times
acceleration’. Since acceleration is the rate of change of velocity, we
can express this as F = m dv/dt. Putting these results together, the
integrand on the right-hand side of equation (20) is


\[ \begin{aligned}
\mathbf{F} \cdot \frac{d\mathbf{s}}{dt} &= m \frac{d\mathbf{v}}{dt} \cdot \mathbf{v} \\
&= m \left( v_x \frac{dv_x}{dt} + v_y \frac{dv_y}{dt} + v_z \frac{dv_z}{dt} \right) \\
&= m \frac{d}{dt} \left[ \tfrac{1}{2} \left( v_x^2 + v_y^2 + v_z^2 \right) \right] \\
&= \frac{d}{dt} \left( \tfrac{1}{2} m v^2 \right),
\end{aligned} \tag{21} \]


where v = |v| is the speed of the particle, which is a function of time
as the particle progresses along its path.

Combining equations (20) and (21), we conclude that


\[ \int_C \mathbf{F} \cdot d\mathbf{s} = \int_{t_1}^{t_2} \frac{d}{dt} \left( \tfrac{1}{2} m v^2 \right) dt = \tfrac{1}{2} m v_2^2 - \tfrac{1}{2} m v_1^2, \tag{22} \]
(The final conclusion does not depend on the use of time as the
parameter in the line integral.)
C Hy at 


where $v_1 = v(t_1)$ is the particle’s initial speed at the beginning of the
path, and $v_2 = v(t_2)$ is its final speed at the end.

Scientists place a special interpretation on these results. The quantity
$\tfrac{1}{2} m v^2$ is called the kinetic energy of the particle, and is interpreted
as the energy that the particle has by virtue of being in motion. If
$v_2 > v_1$ in equation (22), the particle gains kinetic energy as it moves
along the path C.


It is a fundamental principle of science that energy is conserved, so 
the energy gained by the particle must come from somewhere. If you 
push the particle in the absence of other forces, it comes from energy 
stored in your muscles. If the particle falls under gravity, it comes 
from a type of energy known as potential energy, which is stored in 
the gravitational field. You need not worry about the details — the 
important point is that the line integral on the left-hand side of 
equation (22) quantifies the energy transferred to the particle by the 
force F. This is a major application of line integrals. 


We note in passing that it is also possible to evaluate the line integrals of
vector fields in non-Cartesian coordinate systems. If (u, v, w) are
orthogonal coordinates with unit vectors $\mathbf{e}_u$, $\mathbf{e}_v$ and $\mathbf{e}_w$, then a vector
field F may be expressed as


\[ \mathbf{F} = F_u\, \mathbf{e}_u + F_v\, \mathbf{e}_v + F_w\, \mathbf{e}_w. \]
In terms of these coordinates, a path C is represented by parametric
equations of the form

\[ u = u(t), \quad v = v(t), \quad w = w(t) \quad (t_1 \le t \le t_2), \]
and the line integral of F along C is given by the general formula

\[ \int_C \mathbf{F} \cdot d\mathbf{s} = \int_{t_1}^{t_2} \left( F_u h_u \frac{du}{dt} + F_v h_v \frac{dv}{dt} + F_w h_w \frac{dw}{dt} \right) dt, \tag{23} \]


where $h_u$, $h_v$ and $h_w$ are the appropriate scale factors. In the special case
of Cartesian coordinates, all these scale factors are equal to 1, and we 
recover equation (18). You will not be asked to use equation (23) in this 
module, but we quote it to give you a more complete picture, and because 
you may come across it in your future studies. 


3 Line integrals of gradient fields 


You have seen how to calculate the line integral of any vector field. This 
section considers a restricted class of vector fields — those that are 
proportional to the gradients of scalar fields. The line integrals of these 
fields have a special property: they are independent of the path taken 
between their start and end points. This property is very useful in 
applications, including many that arise in physics and engineering. 


3.1 Gradient fields 


We often meet vector fields that are expressed in the form 
\[ \mathbf{F} = -\nabla U, \]


where U is a scalar field. The minus sign in this equation means that the 
vector field F points in the direction in which the scalar field U decreases 
most rapidly. This is usually what is needed. For example, heat flows in 
the direction in which temperature decreases most rapidly. We retain the 
minus sign throughout our discussion because this is what you are most 
likely to meet in real-world applications beyond this module. 


A vector field F is called a gradient field if it can be expressed in 
the form 


\[ \mathbf{F} = -\nabla U, \tag{24} \]
where U is a scalar field. 


U is called the scalar potential field associated with F. 


Not all vector fields are gradient fields, and a test for gradient fields will be 
given later in this section. 


Example 5 


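Show that the vector field $\mathbf{F} = x\, \mathbf{i} - y\, \mathbf{j}$ is a gradient field with an
associated scalar potential field $U = \tfrac{1}{2}(y^2 - x^2)$.


Solution


Taking the partial derivatives of $U = \tfrac{1}{2}(y^2 - x^2)$, we get


\[ \frac{\partial U}{\partial x} = -x \quad \text{and} \quad \frac{\partial U}{\partial y} = y, \]
so
\[ -\nabla U = -\frac{\partial U}{\partial x}\, \mathbf{i} - \frac{\partial U}{\partial y}\, \mathbf{j} = x\, \mathbf{i} - y\, \mathbf{j} = \mathbf{F}. \]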


Gradient fields have a very important property: their line integrals depend
on the start and end points of the path, but are independent of the
detailed shape of the path joining these points.

Figure 23 Paths from A to B

To see why this is so, let us consider a gradient field
\[ \mathbf{F} = -\nabla U. \]


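We can take the line integral of this vector field along a path C with start
point A and end point B. Points on this path are labelled by a
parameter t, with $t = t_A$ at the start point, and $t = t_B$ at the end point.
The line integral is given by


\[ \int_C \mathbf{F} \cdot d\mathbf{s} = \int_{t_A}^{t_B} \mathbf{F} \cdot \frac{d\mathbf{s}}{dt}\; dt = -\int_{t_A}^{t_B} \nabla U \cdot \frac{d\mathbf{s}}{dt}\; dt. \tag{25} \]


Now, the integrand on the right-hand side can be simplified:


\[ \begin{aligned}
\nabla U \cdot \frac{d\mathbf{s}}{dt} &= \left( \frac{\partial U}{\partial x}\, \mathbf{i} + \frac{\partial U}{\partial y}\, \mathbf{j} + \frac{\partial U}{\partial z}\, \mathbf{k} \right) \cdot \left( \frac{dx}{dt}\, \mathbf{i} + \frac{dy}{dt}\, \mathbf{j} + \frac{dz}{dt}\, \mathbf{k} \right) \\
&= \frac{\partial U}{\partial x} \frac{dx}{dt} + \frac{\partial U}{\partial y} \frac{dy}{dt} + \frac{\partial U}{\partial z} \frac{dz}{dt} = \frac{dU}{dt},
\end{aligned} \]
where the last step uses a version of the chain rule (equation (23) of
Unit 7). Using this result in the integral on the far right-hand side of
equation (25), we get


\[ \int_C \mathbf{F} \cdot d\mathbf{s} = -\int_{t_A}^{t_B} \frac{dU}{dt}\; dt = U_A - U_B, \tag{26} \]


where $U_A = U(t_A)$ is the value of U at the start point of the path, and
$U_B = U(t_B)$ is the value of U at the end point.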


The same answer $U_A - U_B$ is obtained no matter which route is taken
between the given start and end points. For example, Figure 23 shows
several paths leading from A to B. The line integral of a gradient field
$\mathbf{F} = -\nabla U$ has the same value for all these paths. For a closed loop, the
start point A is the same as the end point B, so $U_A = U_B$. Hence the line
integral of a gradient field around a closed loop is equal to zero.


A line integral that does not depend on the path between given start and 
end points is said to be path-independent. We can therefore make the 
following statements. 


• Any line integral of a gradient field is path-independent.


• Any line integral of a gradient field around a closed loop is equal
  to zero.


It is easy to find the line integral of a gradient field if we know its 
associated scalar potential field, as the following example shows. 


Example 6 


The vector field $\mathbf{F} = 3x^2 y\, \mathbf{i} + (x^3 + 4y^3)\, \mathbf{j}$ is a gradient field with associated
scalar potential field $U(x, y) = 1 - x^3 y - y^4$. Find the line integral of F
along a path C with parametric equations


\[ x = 1 - \cos t, \quad y = 1 + \sin t \quad (0 \le t \le \pi). \]


Solution


Because F is a gradient field, its line integrals are path-independent.
Although the path is specified in detail, only its start and end points
matter. Substituting $t = 0$ and $t = \pi$ in the parametric equations gives
start point (0, 1) and end point (2, 1). Using equation (26), we then get


\[ \int_C \mathbf{F} \cdot d\mathbf{s} = U(0, 1) - U(2, 1) = 0 - (1 - 8 - 1) = 8. \]
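
Path-independence can be checked numerically for the field of Example 6. The following Python sketch is an illustration under stated assumptions: the helper `line_integral` is a hypothetical Simpson's-rule routine, and the straight-line path from (0, 1) to (2, 1) is an extra path chosen here for comparison.

```python
import math

def line_integral(F, path, velocity, t1, t2, n=2000):
    """Integrate F(r(t)) . dr/dt over [t1, t2] by composite Simpson's rule (n even)."""
    def integrand(t):
        Fx, Fy = F(*path(t))
        dx, dy = velocity(t)
        return Fx * dx + Fy * dy

    h = (t2 - t1) / n
    total = integrand(t1) + integrand(t2)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(t1 + k * h)
    return total * h / 3

def F(x, y):
    """The gradient field of Example 6."""
    return (3 * x**2 * y, x**3 + 4 * y**3)

def U(x, y):
    """Its scalar potential field, so that F = -grad U."""
    return 1 - x**3 * y - y**4

# The semicircular path given in the example.
curved = line_integral(
    F,
    path=lambda t: (1 - math.cos(t), 1 + math.sin(t)),
    velocity=lambda t: (math.sin(t), math.cos(t)),
    t1=0.0, t2=math.pi,
)

# A straight line between the same end points (chosen here for comparison).
straight = line_integral(
    F,
    path=lambda t: (2 * t, 1.0),
    velocity=lambda t: (2.0, 0.0),
    t1=0.0, t2=1.0,
)

# Equation (26): the potential at the start point minus the potential at the end point.
from_potential = U(0, 1) - U(2, 1)
print(curved, straight, from_potential)
```

All three numbers agree to within the accuracy of the quadrature, which is what path-independence predicts.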


Exercise 9 


The vector field $\mathbf{F} = x\, \mathbf{i} - y\, \mathbf{j}$ is a gradient field with an associated scalar
potential field $U = \tfrac{1}{2}(y^2 - x^2)$. Find the line integral of F along any
path C that starts from (1, 1) and ends at (7, 3).


3.2 Conservative vector fields 


You have seen that the line integrals of gradient fields are always 
path-independent, and this simplifies the evaluation of these line integrals. 
In order to focus on the property of path-independence, we make a 
definition. 


Conservative fields 


A vector field F is said to be conservative if, throughout its domain, 
all of its line integrals are path-independent. 


Using this definition, it is clear that all gradient fields are conservative 
fields. But can we say that all conservative fields are gradient fields? Our 
definitions do not exclude the possibility that a field could be conservative, 
and yet not be expressible in the form $\mathbf{F} = -\nabla U$. You will soon see that
all conservative fields are gradient fields, but a little more work is needed 
to establish this fact. 


Why conservative? 


The word ‘conservative’ is used for historical reasons. An early 
application of line integrals was to calculate the energy transferred by 
a force when a particle moves from one point to another. If these line 
integrals are path-independent, it turns out that the law of 
conservation of energy can be expressed by a simple formula involving 
kinetic and potential energies. The term conservative field derives 
from conservation of energy, but is now a far more general concept. 


If we are told that a given vector field is conservative, we can often
simplify the evaluation of its line integrals, as the following example shows.

Figure 24 A semicircular path and its straight-line replacement

Figure 25 Two routes from $\mathbf{r}_0$ to $\mathbf{r}_B$
(The choice of $\mathbf{r}_0$ is arbitrary, but this does not matter: we just
make some definite choice.)

Example 7


The vector field $\mathbf{F} = (x^2 + y^2)\, \mathbf{i} + 2xy\, \mathbf{j}$ is conservative. Calculate its line
integral along the red semicircular path in Figure 24.


Solution


We are told that F is a conservative field, so we are free to replace the
semicircular path by a straight-line path along the x-axis, shown in blue in
Figure 24. Along this straight-line path, the component of F in the
direction of the path is $F_x = x^2 + 0 = x^2$, so the line integral is


\[ \int_{-R}^{R} F_x \, dx = \int_{-R}^{R} x^2 \, dx = \left[ \tfrac{1}{3} x^3 \right]_{-R}^{R} = \tfrac{2}{3} R^3. \]


This is equal to the required line integral because the blue and red paths 
share the same start and end points, and the field is conservative. 


Exercise 10 


The vector field F = 3a27yi + (x? + y?)j is conservative. Find the line 
integral of F along any path that starts at the origin and ends at (1,2). 


But be very careful: you cannot adjust the path of a line integral of a 
vector field unless you know that the field is conservative. If the field is 
not conservative, changing the path will generally give the wrong answer! 
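
The warning can be illustrated with a field that is not conservative. The following Python sketch uses the field F = −y i + x j, which is a hypothetical example chosen for this illustration and does not appear elsewhere in the text, and integrates it from (1, 0) to (−1, 0) along the upper and lower unit semicircles. The helper `line_integral` is likewise an assumed Simpson's-rule routine.

```python
import math

def line_integral(F, path, velocity, t1, t2, n=2000):
    """Integrate F(r(t)) . dr/dt over [t1, t2] by composite Simpson's rule (n even)."""
    def integrand(t):
        Fx, Fy = F(*path(t))
        dx, dy = velocity(t)
        return Fx * dx + Fy * dy

    h = (t2 - t1) / n
    total = integrand(t1) + integrand(t2)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(t1 + k * h)
    return total * h / 3

def swirl(x, y):
    """A non-conservative field (illustrative choice): F = -y i + x j."""
    return (-y, x)

# Upper semicircle from (1, 0) to (-1, 0).
upper = line_integral(
    swirl,
    path=lambda t: (math.cos(t), math.sin(t)),
    velocity=lambda t: (-math.sin(t), math.cos(t)),
    t1=0.0, t2=math.pi,
)

# Lower semicircle between the same start and end points.
lower = line_integral(
    swirl,
    path=lambda t: (math.cos(t), -math.sin(t)),
    velocity=lambda t: (-math.sin(t), -math.cos(t)),
    t1=0.0, t2=math.pi,
)

print(upper, lower)
```

The two paths share their start and end points, yet the integrals come out as approximately +π and −π, so replacing one path by the other would change the answer.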


We can use an alternative notation for the line integrals of conservative
fields, reflecting the fact that they are path-independent. The line integral
of the conservative field F along a path C with start point A and end
point B can be written in any of the forms


\[ \int_C \mathbf{F} \cdot d\mathbf{s} = \int_{A \to B} \mathbf{F} \cdot d\mathbf{s} = \int_{\mathbf{r}_A \to \mathbf{r}_B} \mathbf{F} \cdot d\mathbf{s}, \]


where $\mathbf{r}_A$ and $\mathbf{r}_B$ are the position vectors of A and B. The notations used
on the right have the advantage of explicitly indicating the start and end
points without giving irrelevant information about the precise shape of the
path. Using this notation, and referring to Figure 25, it is easy to see that
for any conservative vector field F,


\[ \int_{\mathbf{r}_0 \to \mathbf{r}_B} \mathbf{F} \cdot d\mathbf{s} = \int_{\mathbf{r}_0 \to \mathbf{r}_A} \mathbf{F} \cdot d\mathbf{s} + \int_{\mathbf{r}_A \to \mathbf{r}_B} \mathbf{F} \cdot d\mathbf{s}. \tag{27} \]


This expresses the fact that the line integral from $\mathbf{r}_0$ to $\mathbf{r}_B$ is the same
whether the route goes via $\mathbf{r}_A$ or not.


We now show that any conservative field F must also be a gradient field.
We do this by constructing a scalar function U(r) for which $\mathbf{F} = -\nabla U$.
This is done by first choosing a fixed reference point $\mathbf{r}_0$ at which
$U(\mathbf{r}_0) = 0$, and then defining


\[ U(\mathbf{r}) = -\int_{\mathbf{r}_0 \to \mathbf{r}} \mathbf{F} \cdot d\mathbf{s}. \tag{28} \]


For a conservative vector field F and a fixed reference point $\mathbf{r}_0$, the line
integral on the right-hand side of equation (28) depends only on the end point r. So U(r) is
a well-defined function, which is equal to zero at $\mathbf{r} = \mathbf{r}_0$. Because the line
integral involves a scalar product, U(r) is a scalar quantity; in fact, it is
the scalar potential field associated with F, as we will now show.


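When we move from a point $\mathbf{r}_A$ to another point $\mathbf{r}_B$, equation (28) tells us
that U changes by


\[ U(\mathbf{r}_B) - U(\mathbf{r}_A) = -\int_{\mathbf{r}_0 \to \mathbf{r}_B} \mathbf{F} \cdot d\mathbf{s} + \int_{\mathbf{r}_0 \to \mathbf{r}_A} \mathbf{F} \cdot d\mathbf{s}. \]


Using equation (27), this is equivalent to


\[ U(\mathbf{r}_B) - U(\mathbf{r}_A) = -\int_{\mathbf{r}_A \to \mathbf{r}_B} \mathbf{F} \cdot d\mathbf{s}. \tag{29} \]


Now let us take $\mathbf{r}_A = (x, y, z)$ and $\mathbf{r}_B = (x + \delta x, y, z)$, where $\delta x$ is a tiny
increment in x. Then the line integral on the right-hand side of
equation (29) can be taken parallel to the x-axis. If $\delta x$ is very small, we get


\[ U(x + \delta x, y, z) - U(x, y, z) \approx -F_x(x, y, z)\, \delta x. \]


(For a point r = (x, y, z), we use the notations U(r) and U(x, y, z) interchangeably.)


Then dividing both sides by $\delta x$ and taking the limit as $\delta x$ tends to zero, we
get


\[ \frac{\partial U}{\partial x} = -F_x. \]


Similar results for $\partial U/\partial y$ and $\partial U/\partial z$ are obtained by making tiny
displacements in the y- and z-directions. Collecting these results together,
we conclude that


\[ F_x = -\frac{\partial U}{\partial x}, \quad F_y = -\frac{\partial U}{\partial y}, \quad F_z = -\frac{\partial U}{\partial z}, \]


or in vector form,


\[ \mathbf{F} = -\nabla U. \]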


This shows that any conservative field is a gradient field. We already know 
that any gradient field is a conservative field, so we reach the following 
memorable and important conclusion. 


The terms conservative field and gradient field are synonymous and 
can be used interchangeably. 


Given a conservative field F, we often want to find the corresponding 
scalar potential field U. This is useful because once we know U, we can 
easily evaluate any line integral of F using equation (29). To obtain a 
formula for U, we can use equation (28) with any convenient choice of path 
for the line integral. This technique is illustrated in the following example.


Example 8


A conservative vector field takes the form $\mathbf{F} = x y^2\,\mathbf{i} + x^2 y\,\mathbf{j}$. Find the
associated scalar potential field U(x, y), taking U = 0 at the origin.
Solution 


From equation (28), the scalar potential field is given by


\[ U(\mathbf{r}) = -\int_{\mathbf{0} \to \mathbf{r}} \mathbf{F} \cdot d\mathbf{s}. \]


Because the vector field is conservative, we can evaluate this line integral
over any convenient path. Let us consider an arbitrary point (a, b). We
choose a straight-line path from the origin to this point, as shown in
Figure 26.


Figure 26 A straight-line path from the origin to (a, b)


This path can be described by the parametric equations


\[ x = at, \quad y = bt \quad (0 \le t \le 1). \]


The values of a and b are constant along the path, so


\[ \frac{dx}{dt} = a \quad \text{and} \quad \frac{dy}{dt} = b. \]


Hence


\[ \mathbf{F} \cdot \frac{d\mathbf{s}}{dt} = (at)(bt)^2 a + (at)^2 (bt) b = 2 a^2 b^2 t^3 \]


and


\[ U(a, b) = -\int_{t=0}^{t=1} 2 a^2 b^2 t^3 \, dt = -\tfrac{1}{2} a^2 b^2. \]


However, the point (a, b) is arbitrary, so for any point (x, y), we conclude
that the scalar potential field is


\[ U(x, y) = -\tfrac{1}{2} x^2 y^2. \]


This answer can (and should) be checked by taking its gradient:


\[ \nabla U = -x y^2\,\mathbf{i} - x^2 y\,\mathbf{j} = -\mathbf{F}, \]


as required.
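
The same calculation can be spot-checked numerically. The sketch below is an illustration only: it evaluates equation (28) along the straight-line path x = at, y = bt with a midpoint rule, and then estimates $-\nabla U$ by central differences. The function names `potential` and `minus_grad`, the sample point and the step sizes are hypothetical choices made here.

```python
def F(x, y):
    # The field of Example 8: F = x y^2 i + x^2 y j
    return (x * y**2, x**2 * y)


def potential(a, b, n=10000):
    """U(a, b) = -(line integral of F from the origin to (a, b)).

    Uses the path x = a t, y = b t (0 <= t <= 1), so that
    F . ds = (F_x a + F_y b) dt, integrated with a midpoint rule.
    """
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        fx, fy = F(a * t, b * t)
        total += (fx * a + fy * b) / n
    return -total


def minus_grad(U, x, y, h=1e-5):
    """Central-difference estimate of -grad U at (x, y)."""
    return (-(U(x + h, y) - U(x - h, y)) / (2 * h),
            -(U(x, y + h) - U(x, y - h)) / (2 * h))


a, b = 1.5, -2.0
print(potential(a, b), -0.5 * a**2 * b**2)   # numerical U versus -x^2 y^2 / 2
print(minus_grad(potential, a, b), F(a, b))  # -grad U versus F
```

Within the numerical error, the first line should reproduce $-\tfrac{1}{2}a^2b^2$ and the second should reproduce the components of F, in line with F = −∇U.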


Exercise 11 


A conservative vector field takes the form $\mathbf{F} = \cos x\,\mathbf{i} + \sin y\,\mathbf{j}$. Find the
associated scalar potential field U(x, y), taking U(0, 0) = 0.


3.3 The curl test


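This subsection gives a test that allows us to decide whether or not a given
vector field is conservative. We start by noting that if F is a conservative
vector field, then it is also a gradient field and can be written in the form


\[ \mathbf{F} = -\nabla U. \]


Taking the curl of both sides of this equation, we get


\[ \nabla \times \mathbf{F} = -\nabla \times (\nabla U) = -\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[4pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[8pt] \dfrac{\partial U}{\partial x} & \dfrac{\partial U}{\partial y} & \dfrac{\partial U}{\partial z} \end{vmatrix}. \]


Expanding the determinant on the right-hand side in the usual way, and
using the mixed partial derivative theorem of Unit 7, we get the zero
vector 0 = (0, 0, 0). For example, the x-component of the determinant is


\[ \frac{\partial}{\partial y}\left(\frac{\partial U}{\partial z}\right) - \frac{\partial}{\partial z}\left(\frac{\partial U}{\partial y}\right) = 0. \]


(This calculation was done in Exercise 27 of Unit 9.)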


So any conservative field has zero curl throughout its domain. 


In most circumstances, the converse statement is also true; so if a vector 
field F has $\nabla \times \mathbf{F} = \mathbf{0}$ throughout its domain, then F is conservative. We
do not have the tools to prove this yet, but a proof is given at the end of 
the unit. Taking the result on trust, and assuming that all the necessary 
conditions are met, leads to the following curl test. This is the normal way 
of deciding whether or not a given vector field is conservative. 


Curl test for conservative fields


To test whether the vector field F is conservative, evaluate $\nabla \times \mathbf{F}$.


If $\nabla \times \mathbf{F} = \mathbf{0}$ everywhere in the domain of F, then F is conservative.


Otherwise, it is not conservative.


Strictly speaking, the curl test assumes that the domain of F is
'simple' in a certain sense (technically, simply-connected). We will
return to this point in Subsection 5.3.


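Example 9
Determine whether or not the following vector fields are conservative.


(a) $\mathbf{F} = yz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$    (b) $\mathbf{G} = z^2\,\mathbf{i} + x^2\,\mathbf{j} + y^2\,\mathbf{k}$


Solution
(a) We have


\[ \nabla \times \mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[4pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[8pt] yz & xz & xy \end{vmatrix} = \mathbf{i}(x - x) - \mathbf{j}(y - y) + \mathbf{k}(z - z) = \mathbf{0}. \]


So the curl test shows that F is conservative.


(b) We have


\[ \nabla \times \mathbf{G} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[4pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[8pt] z^2 & x^2 & y^2 \end{vmatrix} = \mathbf{i}(2y) - \mathbf{j}(-2z) + \mathbf{k}(2x). \]


This is not equal to 0 everywhere, so G is not conservative.


Figure 27 A planar element
and its unit normal $\hat{\mathbf{n}}$


Figure 28 Some oriented
areas


Exercise 12
Determine whether or not the following vector fields are conservative.


(a) $\mathbf{F} = y\,\mathbf{i} + x\,\mathbf{j} + z\,\mathbf{k}$    (b) $\mathbf{G} = -y\,\mathbf{i} + x\,\mathbf{j} + z\,\mathbf{k}$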
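
The curl calculations of Example 9 can be spot-checked numerically. The sketch below is illustrative only: it estimates the curl at a single sample point by central differences, taking the fields of Example 9 to be F = yz i + xz j + xy k and G = z² i + x² j + y² k. The helper name `curl`, the sample point and the step size are assumptions made here. A non-zero value at one point is enough to show that a field is not conservative, but a zero value at one point proves nothing, because the curl test requires zero curl everywhere in the domain.

```python
def curl(field, p, h=1e-5):
    """Central-difference estimate of curl(field) at the point p = (x, y, z)."""
    def d(comp, axis):
        # Partial derivative of component `comp` with respect to coordinate `axis`
        lo = list(p)
        hi = list(p)
        lo[axis] -= h
        hi[axis] += h
        return (field(*hi)[comp] - field(*lo)[comp]) / (2 * h)

    return (d(2, 1) - d(1, 2),   # dFz/dy - dFy/dz
            d(0, 2) - d(2, 0),   # dFx/dz - dFz/dx
            d(1, 0) - d(0, 1))   # dFy/dx - dFx/dy


def F(x, y, z):
    return (y * z, x * z, x * y)


def G(x, y, z):
    return (z**2, x**2, y**2)


p = (0.7, -1.2, 2.0)
print(curl(F, p))  # every component should be close to zero
print(curl(G, p))  # should be close to (2y, 2z, 2x) at the sample point
```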


4 Flux and the divergence theorem 


This section introduces the concept of the flux of a vector field, which is a 
type of surface integral. Unit 8 showed you how to integrate scalar 
functions over surfaces. Here, we explain how to calculate surface integrals 
of vector functions. These integrals are found throughout science and 
engineering. For example, they are used to calculate the amount of air 
flowing out of a given region or the rate at which heat energy is lost 
through the walls, roof and ground floor of a house. They are also 
important in electromagnetism. 


The concept of flux allows us to quantify the extent to which a vector field 
diverges outwards from a given point. Once this has been understood, we 
can return to a major theme of Unit 9 — divergence and its interpretation. 
An important result called the divergence theorem will cast further light on 
the meaning of divergence. 


4.1 Flux over a planar surface element 


You probably think of area as a scalar quantity — a certain number of 
square metres or square inches, say. This is fine for areas drawn on a sheet 
of paper. More generally, we can consider a planar element that is oriented
in three-dimensional space. Such an element is characterised by its area $\delta S$
and its orientation in space.


The simplest way of describing the orientation is to specify a unit vector n 
that is perpendicular to the surface of the element (Figure 27). There are 
actually two vectors that could be used for this purpose, pointing in 
opposite directions. This is not a problem — we just pick one of these 
vectors and specify our selection clearly. The chosen unit vector is then 
called the unit normal of the planar element. 


Information about the area of the element and its orientation can be 
combined. We define the oriented area of the element to be


\[ \delta\mathbf{S} = \delta S\, \hat{\mathbf{n}}. \tag{30} \]


This is a vector quantity. Its magnitude is the (scalar) area of the element,
$\delta S$, and its direction gives the direction of the unit normal to the element.
Figure 28 shows the oriented area vectors of some planar elements. The 
magnitudes of these vectors are larger for elements of larger area, and this 
is indicated by the relative lengths of the arrows. 




Now consider a fluid such as water or air moving in three-dimensional 
space. We can imagine a tiny planar element inside this fluid. This 
element is a mathematical construction rather than a tangible object, so it 
does not interfere with the flow of the fluid. We can then ask: how much 
fluid passes through the planar element per unit time? 


In the simplest case, the planar element is perpendicular to the flow of the
fluid, with its unit normal $\hat{\mathbf{n}}$ in the same direction as the fluid flow, as in
Figure 29(a). If the fluid has velocity $\mathbf{v}$ and speed $v = |\mathbf{v}|$, then the volume
of fluid passing through the element in a small time $\delta t$ is equal to the
volume of the red box in Figure 29(a). This box has length $v\,\delta t$ and
cross-sectional area $\delta S$, so the volume of fluid passing through the element
in time $\delta t$ is


\[ \delta V = v\, \delta t\, \delta S. \]


Now let us see what happens when the planar element is not perpendicular
to the flow. Figure 29(b) shows a case where the unit normal $\hat{\mathbf{n}}$ makes an
angle $\theta$ with the velocity vector $\mathbf{v}$ of the fluid. In this general case, the
volume of fluid passing through the element in time $\delta t$ is equal to the
volume of the red skewed box in Figure 29(b).


Figure 29 Flow through planar elements that are: (a) perpendicular to
the flow; (b) not perpendicular to the flow


However, geometry tells us that the red skewed box has the same volume
as the black box in Figure 29(b), so its volume is


\[ \delta V = (v\, \delta t \cos\theta)\, \delta S = (v \cos\theta)\, \delta t\, \delta S. \]


The quantity $v\cos\theta$ that appears in this equation is the component of the
fluid velocity in the direction of the unit normal, since


\[ \hat{\mathbf{n}} \cdot \mathbf{v} = |\hat{\mathbf{n}}|\,|\mathbf{v}| \cos\theta = v \cos\theta. \]


Hence


\[ \delta V = (\hat{\mathbf{n}} \cdot \mathbf{v})\, \delta t\, \delta S. \]


Using the definition of the oriented area of a planar element in
equation (30), this can also be written as


\[ \delta V = (\mathbf{v} \cdot \delta\mathbf{S})\, \delta t. \]


Dividing both sides by $\delta t$, and taking the limit as $\delta t$ tends to zero, we see
that the rate of flow of fluid volume through the planar element is


\[ \frac{dV}{dt} = \mathbf{v} \cdot \delta\mathbf{S}. \tag{31} \]


The right-hand side of equation (31) is what we have been building
towards. The quantity $\mathbf{v} \cdot \delta\mathbf{S}$ is called the flux of $\mathbf{v}$ over the planar element
with oriented area $\delta\mathbf{S}$, and we have shown that this is equal to the rate of
flow of fluid through the element.


More generally, we can define the flux of any vector field. 


Flux of a vector field over a planar element 


Given any vector field F and a planar element with oriented area $\delta\mathbf{S}$,
the flux of the vector field over the element is defined as


\[ \text{flux} = \mathbf{F} \cdot \delta\mathbf{S} = |\mathbf{F}|\, \delta S \cos\theta, \tag{32} \]


where the field F is evaluated at the position of the element, and $\theta$ is
the angle between the directions of F and the unit normal $\hat{\mathbf{n}}$ to the
planar element.


We have


\[ \mathbf{F} \cdot \delta\mathbf{S} = (\hat{\mathbf{n}} \cdot \mathbf{F})\, \delta S. \]


Since $\hat{\mathbf{n}} \cdot \mathbf{F}$ is the normal component of the field, we can also state the
following.


The flux of a vector field over a planar element is the normal 
component of the field multiplied by the area of the element. 


Flux is a scalar quantity, which can be positive, negative or zero 
depending on the relative orientations of F and the unit normal n. 


The word ‘flux’ derives from a Latin word for flow, and you have seen that 
it allows us to describe the rate of flow of a fluid through a planar element. 
However, equation (32) defines flux for any vector field. When describing 
electric and magnetic fields, for example, we may be interested in their 
fluxes, even though these fields do not actually flow. 
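
As a small illustration of equation (32), the sketch below computes the flux over a planar element as the normal component of the field multiplied by the area of the element. The field value, the normals and the area are hypothetical numbers chosen only for illustration, and `planar_flux` is a helper name introduced here.

```python
import math


def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def planar_flux(field_value, normal, area):
    """Flux F . dS over a planar element, with dS = area * (unit normal).

    `normal` need not be normalised; it is converted to a unit vector here.
    """
    length = math.sqrt(dot(normal, normal))
    n_hat = tuple(c / length for c in normal)
    return dot(field_value, n_hat) * area


F_value = (3.0, 0.0, 4.0)  # field evaluated at the position of the element
print(planar_flux(F_value, (0, 0, 1), 2.0))   # normal along +k: positive flux
print(planar_flux(F_value, (0, 0, -1), 2.0))  # opposite normal: sign reverses
print(planar_flux(F_value, (0, 1, 0), 2.0))   # field parallel to the element: zero flux
print(planar_flux(F_value, (1, 0, 1), 2.0))   # oblique element
```

The first three cases show the flux being positive, negative and zero according to the relative orientation of the field and the chosen unit normal.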


Exercise 13 


The figure in the margin shows two planar elements A and B, with their 
unit normals indicated. Find the flux of the vector field F = 2i + 3j + 5k 
over both of these elements. 




4.2 Flux over an extended surface 


In general, we are interested in the flux of a vector field over an extended 
surface, which may be curved. This can be found by integrating over the 
surface, as we now explain. 


Figure 30 shows a vector field F and an extended curved surface S. The 
vector field, and its angle to the normal to the surface, may vary from 
point to point. However, we can imagine dividing the surface into many 
tiny elements, each of which can be approximated by a tiny planar area 
element, with oriented area $\delta\mathbf{S}_i$, at position $\mathbf{r}_i$. For each planar element,
we have a choice of two directions for the unit normal. We require these
choices to be made consistently so that neighbouring elements have unit
normals that are nearly parallel rather than nearly antiparallel.


The flux of F over the extended surface S is then approximated by


\[ \text{flux over } S \approx \sum_i \mathbf{F}(\mathbf{r}_i) \cdot \delta\mathbf{S}_i, \]


where the sum is over all the planar elements approximating the surface.
We take the limit where the size of each element tends to zero and the
number of elements tends to infinity. In this limit, the sum is written as


\[ \text{flux over } S = \int_S \mathbf{F}(\mathbf{r}) \cdot d\mathbf{S}, \tag{33} \]


where the expression on the right-hand side is called the surface integral
of F over the surface S. We will explain how to evaluate this surface
integral shortly.


First, it is important to distinguish between two types of surface. An open 
surface has at least one boundary curve marking the furthest extent of 
the surface; an example is shown in Figure 30. A closed surface has no 
boundary curves, and divides three-dimensional space into two parts: the 
space inside the surface and the space outside the surface. (The shell of an 
egg forms a closed surface until you break it open.) In the case of an open 
surface, there are two choices for the set of unit normals, and it is 
necessary to state which choice has been made by, for example, drawing a 
diagram showing the unit normals at a few points. For any closed surface, 
a standard convention is used. 


Unit normal convention for closed surfaces 


For any closed surface, all the unit normals are chosen to point 
outwards into the exterior space, rather than inwards towards the 
enclosed volume (see Figure 31). 


This convention has an important consequence for any vector field that 
represents a flow. If the flux of the field over a closed surface is positive, 
the net flow is outwards; if the flux is negative, the net flow is inwards. 


Figure 30 A vector field F
and an extended surface S


Note that the notation $d\mathbf{S}$ is
used in surface integrals, while
$d\mathbf{s}$ appears in line integrals.


Figure 31 A closed surface
and its outward-pointing unit
normals


Figure 32 A patch
generated by tiny increments
in u and v


We now turn to the calculation of surface integrals of vector fields.
Fortunately, the key ideas were introduced in Section 5 of Unit 8 in the
context of finding the area of a curved surface.


The first idea is to label points on the surface by a pair of parameters, 
(u,v). For example, points on the surface of a sphere, centred on the 
origin, can be labelled by the angular coordinates $(\theta, \phi)$ of a spherical
coordinate system. The radial coordinate r takes a constant value R all
over the surface of the sphere. Since r does not distinguish different points
on the surface, it is not regarded as one of the parameters that label points
on the surface.


Points on the surface can also be labelled by their Cartesian coordinates
(x, y, z), and there is a set of equations linking (x, y, z) to (u, v) for any
point on the surface:


\[ x = x(u, v), \quad y = y(u, v), \quad z = z(u, v). \]


On a spherical surface of radius R, for example, the relevant equations are


\[ x = R \sin\theta \cos\phi, \quad y = R \sin\theta \sin\phi, \quad z = R \cos\theta. \]


This is reminiscent of the parametric description of a curve, but while each 
point on a curve is labelled by a single parameter t, each point on a curved 
surface is labelled by two parameters, (u,v). 


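When u and v increase by tiny amounts $\delta u$ and $\delta v$, a tiny patch is
generated on the surface, as shown in Figure 32. Unit 8 showed that the
area of such a patch is


\[ \delta S = |\mathbf{J}|\, \delta u\, \delta v, \tag{34} \]


where J is the Jacobian vector, given by


\[ \mathbf{J} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[4pt] \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u} \\[8pt] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v} \end{vmatrix}. \tag{35} \]


Unit 8 noted that this vector is perpendicular to the surface. We can
always choose the order of the parameters u and v to ensure that J is
parallel to the chosen unit normal $\hat{\mathbf{n}}$ of the surface. Assuming that this has
been done, the oriented area of the tiny surface patch is simply


\[ \delta\mathbf{S} = \mathbf{J}\, \delta u\, \delta v, \]


and the flux of a vector field F over the patch is given by


\[ \mathbf{F} \cdot \delta\mathbf{S} = \mathbf{F} \cdot \mathbf{J}\, \delta u\, \delta v. \]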


The flux over the entire surface S is found by adding up contributions like 
this from each of its patches. In the limit where the patches shrink to zero 
size, the sum becomes an integral over suitable ranges of u and v. 




Evaluating the flux of a vector field over an extended surface 


If points on a surface S are parametrised by (u, v), where $u_1 \le u \le u_2$
and $v_1 \le v \le v_2$, then the flux of a vector field F over the surface is
given by the integral


\[ \int_S \mathbf{F} \cdot d\mathbf{S} = \int_{v=v_1}^{v=v_2} \left( \int_{u=u_1}^{u=u_2} \mathbf{F} \cdot \mathbf{J} \, du \right) dv. \tag{36} \]


To evaluate this integral, the integrand $\mathbf{F} \cdot \mathbf{J}$ must be expressed in
terms of the parameters u and v.


To avoid duplicated effort, we skip the step of calculating J from the
determinant in equation (35). You have had practice at doing this in
Unit 8, and there is nothing to be gained by going through similar steps
again. To understand how equation (36) is used, it is sufficient to look at
surfaces that are spheres or portions of spheres. The Jacobian vector J for
a spherical surface was calculated in Unit 8, and we quote that result here
for ease of reference.


The Jacobian vector J on the surface of a sphere 


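On the surface of a sphere of radius R, centred on the origin and
parametrised by $(\theta, \phi)$ of spherical coordinates,


\[ \mathbf{J} = R^2 \sin\theta\, \mathbf{e}_r, \tag{37} \]


where


\[ \mathbf{e}_r = \sin\theta \cos\phi\,\mathbf{i} + \sin\theta \sin\phi\,\mathbf{j} + \cos\theta\,\mathbf{k} \tag{38} \]


is the radial unit vector of spherical coordinates.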
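
The quoted result can be checked numerically against the determinant in equation (35), which is the vector product of the two tangent vectors ∂r/∂θ and ∂r/∂φ. The sketch below is an illustration only: the tangent vectors are estimated by central differences, and the helper names, sample point and step size are assumptions made here.

```python
import math


def position(theta, phi, R):
    """Cartesian coordinates of the point (theta, phi) on a sphere of radius R."""
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))


def cross(a, b):
    """Vector product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def jacobian_vector(theta, phi, R, h=1e-6):
    """Estimate J = (dr/dtheta) x (dr/dphi) using central differences."""
    d_theta = tuple((p - m) / (2 * h) for p, m in
                    zip(position(theta + h, phi, R), position(theta - h, phi, R)))
    d_phi = tuple((p - m) / (2 * h) for p, m in
                  zip(position(theta, phi + h, R), position(theta, phi - h, R)))
    return cross(d_theta, d_phi)


def e_r(theta, phi):
    """Radial unit vector of spherical coordinates, equation (38)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


theta, phi, R = 0.9, 2.1, 2.0
print(jacobian_vector(theta, phi, R))
print(tuple(R**2 * math.sin(theta) * c for c in e_r(theta, phi)))  # equation (37)
```

The two printed vectors should agree to within the finite-difference error. Taking the parameters in the order (θ, φ) gives a vector that points radially outwards.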


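A simple but important case arises when a vector field expressed in
spherical coordinates takes the form


\[ \mathbf{F} = \frac{A}{r^2}\, \mathbf{e}_r, \]


where A is a constant. To calculate the flux of this field over a spherical
surface of radius R, centred on the origin, we set r = R in the expression
for F and take J from equation (37). This gives


\[ \mathbf{F} \cdot \mathbf{J} = \left( \frac{A}{R^2}\, \mathbf{e}_r \right) \cdot \left( R^2 \sin\theta\, \mathbf{e}_r \right) = A \sin\theta. \]


The flux of F over the surface of the sphere is then given by


\[ \int_S \mathbf{F} \cdot d\mathbf{S} = A \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} \sin\theta \, d\theta \right) d\phi = A \int_{\phi=0}^{\phi=2\pi} \Bigl[ -\cos\theta \Bigr]_{\theta=0}^{\theta=\pi} d\phi = 4\pi A. \]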


We assume that the limits of
integration, $u_1$, $u_2$, $v_1$ and $v_2$,
are all constants.


The calculation of J for a
spherical surface is in
Exercise 26 of Unit 8.


This vector points radially
outwards away from the centre
of the sphere.


Since $\mathbf{e}_r$ is a unit vector,
$\mathbf{e}_r \cdot \mathbf{e}_r = 1$.


Remarkably enough, this flux does not depend on the radius of the sphere.
This can be understood with very little calculation. The outward normal
component of the field has the constant value $A/R^2$ all over the surface of
the sphere. Hence all surface elements of the same area make the same
contribution to the surface integral. Under these circumstances, the total
flux of F over the spherical surface can be found by multiplying the
constant outward radial component of the field by the surface area of the
sphere. This gives $A/R^2 \times 4\pi R^2 = 4\pi A$, in agreement with our more
explicit calculation. Shortcuts like this are useful when the normal
component of the field is constant at all points on the surface.


Example 10 


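Calculate the flux of the vector field $\mathbf{F} = x\,\mathbf{i} + y\,\mathbf{j}$ over the surface of a
sphere of radius R, centred on the origin. You may use the standard
integral


\[ \int_0^{\pi} \sin^3\theta \, d\theta = \tfrac{4}{3}. \]


Solution
The coordinate transformation equations for spherical coordinates are


\[ x = r \sin\theta \cos\phi, \quad y = r \sin\theta \sin\phi, \quad z = r \cos\theta. \]


On the surface of the sphere we have r = R, so on this surface the vector
field is


\[ \mathbf{F} = R \sin\theta \cos\phi\,\mathbf{i} + R \sin\theta \sin\phi\,\mathbf{j} = R \sin\theta\, (\cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}). \]


The Jacobian vector on the surface of the sphere is


\[ \mathbf{J} = R^2 \sin\theta\, \mathbf{e}_r, \]


so


\[ \mathbf{F} \cdot \mathbf{J} = R^3 \sin^2\theta\, (\cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}) \cdot \mathbf{e}_r. \]


Using equation (38), we get


\[ \mathbf{F} \cdot \mathbf{J} = R^3 \sin^3\theta\, (\cos^2\phi + \sin^2\phi) = R^3 \sin^3\theta. \]


To find the flux of F over the surface of the sphere, we integrate $\mathbf{F} \cdot \mathbf{J}$ over
the ranges $0 \le \theta \le \pi$ and $0 \le \phi \le 2\pi$ that cover the sphere:


\[ \int_S \mathbf{F} \cdot d\mathbf{S} = R^3 \int_{\phi=0}^{\phi=2\pi} \left( \int_{\theta=0}^{\theta=\pi} \sin^3\theta \, d\theta \right) d\phi. \]


Using the standard integral given in the question, we conclude that


\[ \int_S \mathbf{F} \cdot d\mathbf{S} = R^3 \int_{\phi=0}^{\phi=2\pi} \tfrac{4}{3} \, d\phi = \tfrac{8}{3} \pi R^3. \]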
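
Equation (36) can also be evaluated numerically, which gives a useful check on hand calculations of this kind. The sketch below is illustrative only: it applies a midpoint rule in θ and φ with $\mathbf{J} = R^2 \sin\theta\, \mathbf{e}_r$, and the function name `sphere_flux`, the grid sizes and the choice A = 1 are assumptions made here. It is applied to the field of Example 10 and to the inverse-square field $(A/r^2)\,\mathbf{e}_r$ discussed earlier.

```python
import math


def sphere_flux(field, R, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of the flux of `field` over a sphere of radius R.

    The sphere is centred on the origin and parametrised by (theta, phi),
    with Jacobian vector J = R^2 sin(theta) e_r (outward normals).
    """
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            e_r = (math.sin(theta) * math.cos(phi),
                   math.sin(theta) * math.sin(phi),
                   math.cos(theta))
            Fx, Fy, Fz = field(R * e_r[0], R * e_r[1], R * e_r[2])
            F_dot_J = (Fx * e_r[0] + Fy * e_r[1] + Fz * e_r[2]) * R**2 * math.sin(theta)
            total += F_dot_J * d_theta * d_phi
    return total


def example_10(x, y, z):
    # F = x i + y j
    return (x, y, 0.0)


def inverse_square(x, y, z, A=1.0):
    # F = (A / r^2) e_r, written in Cartesian components
    r = math.sqrt(x * x + y * y + z * z)
    return (A * x / r**3, A * y / r**3, A * z / r**3)


R = 2.0
print(sphere_flux(example_10, R), 8 * math.pi * R**3 / 3)
print(sphere_flux(inverse_square, 1.0), sphere_flux(inverse_square, 3.0), 4 * math.pi)
```

The first line compares the numerical flux with $\tfrac{8}{3}\pi R^3$, and the second shows that the flux of the inverse-square field is close to 4πA for both radii.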




Exercise 14


Calculate the flux of the vector field F = zk over the curved surface of a 
hemisphere of radius R shown in the margin, with its unit normals in the 
sense marked. (The flat base of the hemisphere is not included.) 


Exercise 15 


Calculate the flux of the vector field F = 3k over the same hemispherical 
surface as in Exercise 14. 


4.3 Divergence revisited 


Unit 9 introduced the important concept of the divergence of a vector field:


\[ \nabla \cdot \mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}. \tag{39} \]


We claimed that this gives a measure of the extent to which F diverges or
flows away from any point. Various examples were used to illustrate this
claim, but no proof was given. The concept of flux allows us to quantify
this idea. If we surround a given point P by a tiny closed surface, the flux
of a vector field F over that surface gives us a measure of the flow of the
field away from P. According to Unit 9, there should be a link between
this flux and the divergence of the field at P. We now investigate this link.


Some discussion is needed to reach the main result — the divergence 
theorem. You should follow this discussion in outline to ensure that 
you understand the main ideas, but you will not be asked to reproduce 
the steps. The most important point is the divergence theorem itself 
(equation (43)) and its applications (e.g. Examples 11 and 12). 


First, we choose a surface over which to calculate the flux. This choice will 
not affect our conclusions, but the working is simplified by using the 
surface of a tiny cube with sides of length $\delta L$, whose faces are aligned with
the x-, y- and z-axes (see Figure 33). To find the flux over the surface of
this cube, we must calculate the fluxes over each of its six faces and add
them together. We take the faces in pairs, starting with the two shaded
faces in Figure 33, which are perpendicular to the x-axis.


Figure 33 An enlarged view of a tiny cube with sides of length $\delta L$
aligned with the x-, y- and z-axes


The faces are labelled A and B; the left-hand face is at $x = x_A$, and the
right-hand face is at $x = x_B$. It is important to recall that the unit
normals of a closed surface always point outwards into the exterior space.
This means that the unit normal for face A points in the negative
x-direction and is equal to $-\mathbf{i}$, while the unit normal for face B is $+\mathbf{i}$.
Consequently, the flux of F over face A is


\[ \text{flux over } A = \int_A \mathbf{F} \cdot d\mathbf{S} = \int_A \bigl( -F_x(x_A, y, z) \bigr) \, dy \, dz, \]


and the flux over face B is


\[ \text{flux over } B = \int_B \mathbf{F} \cdot d\mathbf{S} = \int_B \bigl( +F_x(x_B, y, z) \bigr) \, dy \, dz. \]


Because the cube is aligned with the coordinate axes, the ranges of
integration over y and z are identical in both these integrals. This allows
us to express the total flux over both faces as an integral over the y- and
z-values associated with face A:


\[ \text{flux over } (A + B) = \int_A \bigl( F_x(x_B, y, z) - F_x(x_A, y, z) \bigr) \, dy \, dz. \]
A 


Now, the integrand can be simplified. Dividing and multiplying by
$x_B - x_A = \delta L$ and assuming that the cube is very small, we get


\[ F_x(x_B, y, z) - F_x(x_A, y, z) = \frac{F_x(x_B, y, z) - F_x(x_A, y, z)}{x_B - x_A}\, \delta L \approx \frac{\partial F_x}{\partial x}\, \delta L. \]


We therefore obtain


\[ \text{flux over } (A + B) \approx \int_A \left( \frac{\partial F_x}{\partial x}\, \delta L \right) dy \, dz. \]


Because the cube is assumed to be very small, the integrand can be taken
to be constant over face A. The integral is then just the product of the
integrand and the area $(\delta L)^2$ of the face. We therefore conclude that


\[ \text{flux over } (A + B) \approx \frac{\partial F_x}{\partial x}\, \delta L\, (\delta L)^2 = \frac{\partial F_x}{\partial x}\, \delta V, \tag{40} \]


where $\delta V$ is the volume of the cube.


There is nothing special about the x-axis. If we consider the pair of faces
perpendicular to the y-axis, we get a similar result with x replaced
everywhere by y. And if we consider the pair of faces perpendicular to the
z-axis, we again get a similar result with x replaced everywhere by z. The
total flux of F over the entire surface of a small cube is found by adding
these three contributions together:


\[ \text{flux over surface of cube} \approx \left( \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \right) \delta V. \]


Using the definition of divergence in Cartesian coordinates in
equation (39), we see that


\[ \text{flux over surface of cube} \approx \nabla \cdot \mathbf{F}\; \delta V. \tag{41} \]


This is a remarkable result. It establishes the link between divergence and
flux. All the approximations made in deriving it become exact in the limit
as the volume of the cube shrinks to zero. The result has been derived in a
special case, but it is true for all tiny elements of volume no matter what
their shape. This allows us to think about divergence in a new way.


Divergence as flux per unit volume 


The divergence of a vector field F at a given point is related to the
flux of F over a tiny surface enclosing the point. In the limit where
the surface area and its enclosed volume shrink to zero, we have


\[ \nabla \cdot \mathbf{F} = \frac{\text{flux of } \mathbf{F} \text{ over surface}}{\text{volume enclosed by surface}}. \tag{42} \]


So the divergence of a vector field at any point can be interpreted as
the flux per unit volume at that point.
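
The interpretation of divergence as flux per unit volume can be tested numerically by repeating the tiny-cube argument on a computer. The sketch below is an illustration under stated assumptions: the field $x^2 y\,\mathbf{i} + yz\,\mathbf{j} + z^3\,\mathbf{k}$, the sample point, the cube size and the helper name `cube_flux_per_volume` are all hypothetical choices made here. For this field the divergence is $2xy + z + 3z^2$, which equals 34 at the point (1, 2, 3).

```python
def cube_flux_per_volume(field, centre, dL=1e-3, n=4):
    """Outward flux of `field` over a small axis-aligned cube, divided by its volume.

    Each of the six faces is integrated with an n x n midpoint rule.
    The unit normal of a face is +/- the unit vector along `axis`,
    so only that component of the field contributes to the flux.
    """
    half = dL / 2
    h = dL / n
    flux = 0.0
    for axis in range(3):
        for sign in (-1.0, 1.0):
            for i in range(n):
                for j in range(n):
                    offset = [0.0, 0.0, 0.0]
                    offset[axis] = sign * half
                    offset[(axis + 1) % 3] = -half + (i + 0.5) * h
                    offset[(axis + 2) % 3] = -half + (j + 0.5) * h
                    point = tuple(c + o for c, o in zip(centre, offset))
                    flux += sign * field(*point)[axis] * h * h
    return flux / dL**3


def sample_field(x, y, z):
    # Illustrative field with divergence 2xy + z + 3z^2
    return (x**2 * y, y * z, z**3)


print(cube_flux_per_volume(sample_field, (1.0, 2.0, 3.0)))  # close to the divergence, 34
```

Shrinking `dL` further brings the ratio closer to the divergence, until rounding error in the subtraction of nearly equal face fluxes starts to dominate.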


4.4 Additivity of flux and the divergence theorem 


The interpretation of divergence in equation (42) involves the limit of a 
tiny surface surrounding a point. With a little more effort, we can get a 
more powerful result — the divergence theorem — that applies over extended 
surfaces and is very useful in applications. To achieve this, we need to 
establish a rule that allows fluxes to be added together. 


The additivity of flux 


Suppose that a given volume is subdivided into smaller volume elements. 
Then the additivity of flux relates the flux of a vector field over the surface 
of the whole volume to its fluxes over the surfaces of the volume elements. 


The additivity of flux 


If a volume is subdivided into smaller volume elements, the flux of a
vector field over the surface of the whole volume is the sum of its
fluxes over the surfaces of all the volume elements.


Figure 34 Neighbouring
volume elements


This is the key result of this
section. It links surface integrals
to related volume integrals.


To establish this fact, consider two neighbouring volume elements $\Delta V_1$ and
$\Delta V_2$ with surfaces $S_1$ and $S_2$ (Figure 34). The surfaces $S_1$ and $S_2$ share a
common boundary wall (shown in green). At any point on this boundary
wall, the unit normal of $S_1$ points in the opposite direction to the unit
normal of $S_2$. This is because the unit normals of a closed surface always
point outwards, away from the enclosed volume. It follows that the flux of
a vector field F contributed by the boundary wall section of $S_1$ is equal in
magnitude and opposite in sign to the flux contributed by the boundary
wall section of $S_2$. So when we add up the fluxes of F over the surfaces of
all the volume elements, the contributions from the boundary walls all
cancel out. The only surviving contributions come from the external
surfaces, which form the surface of the whole volume.


The divergence theorem 


Finally, we combine the additivity of flux with the interpretation of 
divergence as flux per unit volume. Suppose that we want to find the flux 
of a vector field F over the surface S of a region V (which need not be 
small). Then we can divide the region into tiny subregions, with
surfaces $S_i$. The additivity of flux tells us that


\[ \int_S \mathbf{F} \cdot d\mathbf{S} = \sum_i \, (\text{flux over } S_i), \]


where the sum is over the surfaces of all the volume elements that make up
the region.


The volume elements are assumed to be tiny, so we can use equation (41)
to express each flux in terms of divergence. This gives


\[ \int_S \mathbf{F} \cdot d\mathbf{S} \approx \sum_i \, (\nabla \cdot \mathbf{F})_i \, \delta V_i. \]


Taking the limit of an infinite number of infinitesimal volume elements, the
approximations become exact, and we conclude that


\[ \int_S \mathbf{F} \cdot d\mathbf{S} = \int_V \nabla \cdot \mathbf{F} \, dV. \]


This is the celebrated divergence theorem. 


Divergence theorem 


Given a vector field F and a closed surface S enclosing a volume V,
the divergence theorem states that


\[ \int_S \mathbf{F} \cdot d\mathbf{S} = \int_V \nabla \cdot \mathbf{F} \, dV. \tag{43} \]


In other words, the surface integral of F over a closed surface is equal
to the volume integral of $\nabla \cdot \mathbf{F}$ over the interior of the surface.




It is easy to remember where the symbol $\nabla$ goes. Divergence involves
spatial derivatives, so its units are those of the field divided by length. To 
get the same units on both sides of equation (43), the divergence must be 
in the volume integral, rather than in the surface integral. 


Origins of the divergence theorem 


The divergence theorem is frequently called Gauss’s theorem. In 
fact, it was discovered independently by several people: Joseph-Louis 
Lagrange in 1764, Carl Friedrich Gauss in 1813, George Green in 1828 
and Mikhail Ostrogradsky in 1831. 


The first two did not publish the theorem, but kept it in their 
personal papers. Green was an amateur mathematician with no 
connections to the academic world, and he published his findings in 
an obscure pamphlet. It was not until the 1830s that the theorem 
became well known, thanks to its applications in the newly-developing 
sciences of fluid mechanics and electromagnetism. 


The divergence theorem allows us to convert tricky surface integrals into 
easier volume integrals. The following example illustrates this application. 


Example 11 


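Use the divergence theorem to calculate the surface integral of $\mathbf{F} = 12z\,\mathbf{k}$
over the surface S of the rugby ball in Figure 35. The volume of this rugby
ball was found to be $4\pi a^2 b/3$ in Exercise 19 of Unit 8.


Solution


The divergence of F is


\[ \nabla \cdot \mathbf{F} = \frac{\partial (0)}{\partial x} + \frac{\partial (0)}{\partial y} + \frac{\partial (12z)}{\partial z} = 12. \]


The divergence theorem allows us to express the required surface integral
as a volume integral of $\nabla \cdot \mathbf{F}$ over the volume V of the rugby ball. Using
the result given in the question, we get


\[ \int_S \mathbf{F} \cdot d\mathbf{S} = \int_V \nabla \cdot \mathbf{F} \, dV = \int_V 12 \, dV = 12 \times \tfrac{4}{3} \pi a^2 b = 16 \pi a^2 b. \]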


With a slight modification of this technique, we can convert difficult 
surface integrals into easier ones. 


Figure 35 The surface of a
rugby ball


Unit 10 Integrating scalar and vector fields


Figure 36 A hemispherical
surface with its circular base
in the xy-plane, centred on
the origin


Example 12


Use the divergence theorem to find the flux of $\mathbf{F} = 3\,\mathbf{k}$ over the curved
dome of the hemisphere in Figure 36, with unit normals as shown.


Solution


In order to apply the divergence theorem, we need a closed surface S. We
take this to be the whole surface of the hemisphere, including the curved
dome $S_1$ and the flat base $S_2$, so


\[ \int_S \mathbf{F} \cdot d\mathbf{S} = \int_{S_1} \mathbf{F} \cdot d\mathbf{S} + \int_{S_2} \mathbf{F} \cdot d\mathbf{S}, \]


where the normals of $S_1$ and $S_2$ both point outwards.


The field F is constant, so its divergence is equal to zero everywhere. The
divergence theorem then tells us that


\[ \int_S \mathbf{F} \cdot d\mathbf{S} = 0, \]


so


\[ \int_{S_1} \mathbf{F} \cdot d\mathbf{S} + \int_{S_2} \mathbf{F} \cdot d\mathbf{S} = 0. \]


Now, the surface integral over the flat base $S_2$ is trivial. Because the
field F is constant, and is perpendicular to this surface, we have


\[ \int_{S_2} \mathbf{F} \cdot d\mathbf{S} = -3 \times (\pi R^2) = -3 \pi R^2. \]


Here, the minus sign arises because the unit normals on the flat base point
in the opposite direction to the field. Hence the required surface integral is


\[ \int_{S_1} \mathbf{F} \cdot d\mathbf{S} = 3 \pi R^2, \]


which agrees with the answer to Exercise 15.
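
The cancellation used in Example 12 can be confirmed by direct numerical integration. The sketch below is illustrative only: the dome is integrated with $\mathbf{J} = R^2 \sin\theta\, \mathbf{e}_r$ over $0 \le \theta \le \pi/2$, and the flat base is integrated in plane polar coordinates with outward normal $-\mathbf{k}$. The polar-coordinate treatment of the base, the function names, the grid sizes and the value R = 2 are assumptions made here rather than steps taken from the example.

```python
import math


def dome_flux(field, R, n_theta=200, n_phi=200):
    """Flux over the curved dome (0 <= theta <= pi/2) with outward normals."""
    d_theta = (math.pi / 2) / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            e_r = (math.sin(theta) * math.cos(phi),
                   math.sin(theta) * math.sin(phi),
                   math.cos(theta))
            Fx, Fy, Fz = field(R * e_r[0], R * e_r[1], R * e_r[2])
            F_dot_J = (Fx * e_r[0] + Fy * e_r[1] + Fz * e_r[2]) * R**2 * math.sin(theta)
            total += F_dot_J * d_theta * d_phi
    return total


def base_flux(field, R, n_r=200, n_phi=200):
    """Flux over the flat base z = 0, whose outward unit normal is -k."""
    d_r = R / n_r
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * d_r
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            Fz = field(r * math.cos(phi), r * math.sin(phi), 0.0)[2]
            total += -Fz * r * d_r * d_phi  # area element r dr dphi, normal -k
    return total


def constant_field(x, y, z):
    # F = 3 k
    return (0.0, 0.0, 3.0)


R = 2.0
dome = dome_flux(constant_field, R)
base = base_flux(constant_field, R)
print(dome, 3 * math.pi * R**2)  # dome flux versus 3 pi R^2
print(dome + base)               # total over the closed surface, close to zero
```

The total over the closed surface is close to zero, as the divergence theorem requires for a field with zero divergence.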


Exercise 16 


Use the divergence theorem to calculate the surface integral of the vector
field $\mathbf{F} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ over the surface S of a sphere of radius R.


Exercise 17 


Use the divergence theorem to calculate the surface integral of the vector 
field F = zk over the curved part of the hemispherical surface in Figure 36. 


4.5 The equation of continuity 


This subsection illustrates an important application of the divergence
theorem, but will not be assessed. You can skip it if you are short of
time.




Many quantities are neither created nor destroyed, but just move around 
from place to place. Such quantities are said to be conserved. For 
example, to a good approximation there is a fixed amount of air in the 
atmosphere. The air can move, and its local density may vary, but the total 
mass of air remains constant. We say that the mass of air is conserved. 
Something similar can be said about electric charge and energy. The 
divergence theorem allows us to express such behaviour in a precise way. 


To take a definite case, we consider a fluid of constant total mass. At each 
point r and time t, the fluid is described by its velocity v(r, t) and its 
density ρ(r, t). We consider a fixed surface S enclosing a volume V. The 
total mass of fluid inside this surface is 


$$M = \int_V \rho\, dV.$$


As the fluid moves around, the mass of fluid contained in the region V may 
change, and the rate of change of the enclosed mass is 


$$\frac{dM}{dt} = \frac{d}{dt}\int_V \rho\, dV. \tag{44}$$


Naturally, we assume that there is no spontaneous creation or annihilation 
of fluid, so any change of fluid mass in the region V must be caused by a 
flow across the surface S. A net inward flow produces an increase in local 
mass, while a net outward flow leads to a decrease. 


Equation (31) tells us that the rate of flow of fluid volume across a tiny 
planar element is $\mathbf{v}\cdot\delta\mathbf{S}$, where $\delta\mathbf{S}$ is the oriented area of the element. The 
corresponding rate of flow of fluid mass is $(\rho\mathbf{v})\cdot\delta\mathbf{S}$. Hence the rate of flow 
of fluid mass out of the region V is given by the surface integral 


$$\text{rate of outflow} = \int_S (\rho\mathbf{v})\cdot d\mathbf{S}.$$


Because the unit normals of a closed surface point outwards, this is the 
rate of flow of fluid mass out of the region V, and is equal to —dM/dt, the 
rate of loss of mass from the enclosed region V. We therefore have 


$$-\frac{dM}{dt} = \int_S (\rho\mathbf{v})\cdot d\mathbf{S}. \tag{45}$$


Comparing equations (44) and (45), we conclude that 


$$-\frac{d}{dt}\int_V \rho\, dV = \int_S (\rho\mathbf{v})\cdot d\mathbf{S}. \tag{46}$$


This equation expresses the fact that any change in fluid mass in a region 
is related to a flow into or out of that region. The focus of interest here is 
that it can be recast in an alternative form using the divergence theorem. 


First, the differentiation on the left-hand side can be brought inside the 
integral. This is allowed because the region of integration V does not 
change with time, so any change in the integral must be due to a change in 
its integrand. For any given region, the integral depends only on t, which 
is why straight dees have been used in equation (46). However, the 
density ρ can depend on both t and r, so we must use the curly dees of 
partial differentiation inside the integral. Thus 


$$\frac{d}{dt}\int_V \rho\, dV = \int_V \frac{\partial \rho}{\partial t}\, dV.$$


The vector field ρv is sometimes called the mass flux density, and 
given the symbol J. 


We can also use the divergence theorem to express the right-hand side of 
equation (46) as a volume integral: 


$$\int_S (\rho\mathbf{v})\cdot d\mathbf{S} = \int_V \nabla\cdot(\rho\mathbf{v})\, dV.$$


Equation (46) can therefore be written as 


$$\int_V \left(\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\mathbf{v})\right) dV = 0, \tag{47}$$


where the two volume integrals over V have been combined into a single 
integral. 


When a definite integral is equal to zero, it is normally unsafe to argue 
that its integrand must be equal to zero — there could, after all, be 
cancellations of positive and negative contributions. However, we are in a 
different position. Because equation (47) is valid for any region of 
integration, no matter how small or where it is located, the only possibility 
is for the integrand to be equal to zero everywhere. This leads to the 
following conclusion. 


Equation of continuity 


If the mass of a fluid is conserved, then at each point, its density ρ 
and velocity v are related by 


$$\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0. \tag{48}$$


This is known as the equation of continuity for fluid mass. 


Equations like this appear throughout physics — wherever a quantity that 
flows like a fluid is conserved. For example, the flow of electric charge and 
the flow of energy both obey equations of continuity. You will meet this 
equation again in Unit 12 when we discuss diffusion. 
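

A one-dimensional illustration may help to make the equation of continuity concrete. The sketch below is not from the original text: it assumes a density profile that is carried along rigidly at a constant speed, ρ(x, t) = f(x − ct) with v = c, and checks numerically that ∂ρ/∂t + ∂(ρv)/∂x vanishes. The Gaussian profile, the constant `C` and the function names are all assumptions made for the illustration.

```python
import math

C = 2.0  # assumed constant flow speed

def rho(x, t):
    """Assumed density profile: a Gaussian bump carried along at speed C."""
    return math.exp(-(x - C * t) ** 2)

def continuity_residual(x, t, h=1e-5):
    """Central-difference estimate of d(rho)/dt + d(rho*v)/dx with v = C."""
    d_rho_dt = (rho(x, t + h) - rho(x, t - h)) / (2 * h)
    d_flux_dx = (C * rho(x + h, t) - C * rho(x - h, t)) / (2 * h)
    return d_rho_dt + d_flux_dx

# The residual should be very close to zero at any sample point.
print(continuity_residual(0.3, 0.1))
```

The local increase in density as the bump arrives is balanced exactly by the net inflow of mass, which is what equation (48) expresses.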


We are sometimes interested in steady-state situations where the density ρ 
does not change in time at any point in the fluid. In this case, the equation 
of continuity gives ∇ · (ρv) = 0. This restricts the possible flows that can 
occur in steady-state situations. 
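

The steady-state condition can be probed numerically by estimating the divergence of a candidate field at sample points. The helper below is a hypothetical sketch (the names `divergence`, `swirl` and `source` are not from the text) that uses central differences. A non-zero value at any point rules a field out as a steady-state ρv; a zero value at a few sample points is only suggestive, and the divergence should then be checked analytically.

```python
def divergence(field, x, y, z, h=1e-5):
    """Central-difference estimate of the divergence of a vector field.
    `field(x, y, z)` must return a tuple (Fx, Fy, Fz)."""
    dfx = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2 * h)
    dfy = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2 * h)
    dfz = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2 * h)
    return dfx + dfy + dfz

def swirl(x, y, z):
    return (-y, x, 0.0)   # illustrative field with zero divergence

def source(x, y, z):
    return (x, y, z)      # illustrative field with divergence 3

print(divergence(swirl, 0.4, -1.2, 0.7))    # close to 0
print(divergence(source, 0.4, -1.2, 0.7))   # close to 3
```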


Exercise 18 


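Which of the following vector fields could describe ρv in a steady-state 
flow in a fluid? 


(a) $\rho\mathbf{v} = 2y^2\,\mathbf{i} - 14yz\,\mathbf{j} + 7z^2\,\mathbf{k}$  (b) $\rho\mathbf{v} = 2x\,\mathbf{i} - 3y\,\mathbf{j} + 4z\,\mathbf{k}$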


5 Circulation and the curl theorem 


This section brings our discussion of the calculus of fields to a close by 
investigating the meaning of the curl of a vector field. Its main result is 
the curl theorem, which is the counterpart of the divergence theorem of the 
previous section. 


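Recall that Unit 9 introduced the curl of a vector field as 


$$\nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix}. \tag{49}$$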

This is a vector field. We claimed that the curl vector measures the extent 
to which F is associated with rotation or swirling. A few examples 
supported this claim, but no proof was given. In this section, we use line 
integrals around closed paths to quantify the concept of curl. 


5.1 Circulation of a vector field 


We begin by establishing a convention. Figure 37 shows a planar surface 
element, with unit normal n. The perimeter of this element is a closed 
loop C, which we would like to treat as a path with some sense of positive 
progression. If the surface element were in the xy-plane, viewed from 
above, we might talk about progression in a clockwise or anticlockwise 
sense, but terms like this become ambiguous for planar elements with 
arbitrary orientations, viewed from arbitrary directions. 


Figure 37 A planar surface element


There are two possible choices for the unit normal of a planar element. 
A particular choice has been made in Figure 37. Having made this choice, 
we now fix the sense of positive progression around the perimeter curve C 
by the following convention. 


Right-hand grip rule 


See Figure 38: with the thumb of your right hand pointing in the 
direction of the unit normal of a planar element, the curled fingers of 
your right hand indicate the sense of positive progression around the 
perimeter of the element. 


Figure 38 The right-hand grip rule


Exercise 19 


The arrows in the figure in the margin show senses of progression around 
the perimeters of four shaded patches A, B, C and D on the surface of a 
sphere. For which of these patches do the arrows indicate a sense of 
positive progression? 


Given a planar element $\delta\mathbf{S}$, we can define a closed path C that goes once 
round the perimeter of the element in the positive sense defined by the 
right-hand grip rule. Around this path, we can evaluate the line integral of 
a vector field F. This line integral is called the circulation of the vector 
field around C. We write 


$$\text{circulation} = \int_C \mathbf{F}\cdot d\mathbf{s} = \oint_C \mathbf{F}\cdot d\mathbf{s}.$$


The last expression has a circle in the middle of the integral sign. This 
symbol is sometimes used as a reminder that the path of integration is 
closed, but otherwise makes no difference to the meaning. 


Circulation of a vector field 


Given a vector field F and a closed path C, the circulation of F 
around C is given by the line integral 


$$\text{circulation} = \oint_C \mathbf{F}\cdot d\mathbf{s}. \tag{50}$$


If C is a path around a planar element with a given unit normal, it is 
understood that C is traversed in the positive sense determined by 
the right-hand grip rule. 


Section 2 explained how to calculate line integrals of this type. There is 
nothing new here except that the path is closed, which means that the 
start and end points of the path are identical. So if the path is described 
by the parametric equations 


$$x = x(t), \quad y = y(t), \quad z = z(t) \quad (t_1 \le t \le t_2),$$


then the extreme parameter values $t_1$ and $t_2$ refer to the same point. 
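

The recipe above can be turned into a short numerical routine. The following sketch is an illustration under stated assumptions, not the text's own method: it approximates the closed path by many short chords and sums F · δs, evaluating the field at the midpoint of each chord. The names `circulation`, `unit_circle` and `y_field` are hypothetical, and the field F = yi is chosen only as an example.

```python
import math

def circulation(field, path, t1, t2, n=2000):
    """Estimate the line integral of a 2D field around a closed path.
    `field(x, y)` returns (Fx, Fy); `path(t)` returns (x, y), where
    path(t1) and path(t2) are the same point."""
    dt = (t2 - t1) / n
    total = 0.0
    for i in range(n):
        x0, y0 = path(t1 + i * dt)
        x1, y1 = path(t1 + (i + 1) * dt)
        fx, fy = field(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += fx * (x1 - x0) + fy * (y1 - y0)   # F . (delta s) for this chord
    return total

def unit_circle(t):
    return (math.cos(t), math.sin(t))   # traversed anticlockwise

def y_field(x, y):
    return (y, 0.0)                     # illustrative field F = y i

# For this field and path the exact circulation is -pi.
print(circulation(y_field, unit_circle, 0.0, 2 * math.pi))
```

Reversing the sense of the path (for example, replacing sin t by −sin t) reverses the sign of the result, in line with the right-hand grip convention.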


Exercise 20 


(a) Calculate the circulation of the vector field F = xj around a closed 
path C with parametric representation 


$$x(t) = \cos t, \quad y(t) = -\sin t \quad (0 \le t \le 2\pi).$$


(b) What is the circulation of a conservative vector field G around the 
closed path C? 


5.2 Curl revisited 


Circulation measures the amount of rotation or swirling associated with a 
vector field. You might therefore expect there to be a link between 
circulation and curl, similar to the link between flux and divergence. This 
is indeed the case, and we will now explore the nature of this link. 


The following argument justifies the main result — the curl theorem. 
You should follow this discussion in outline to ensure that you 
understand the main ideas, but you will not be asked to reproduce 
the steps. The most important point is the curl theorem itself 
(equation (57)) and its applications (e.g. Examples 14 and 15). 


We begin by considering a tiny square surface element with sides of 
length δL (Figure 39). The element lies in the xy-plane, with its edges 
aligned with the x- and y-axes, and its unit normal is chosen to be in the 
positive z-direction (towards you). 


We will calculate the circulation of a vector field F around the perimeter of 
this element. To do this, we must first use the right-hand grip rule to 
determine the positive sense of progression around the perimeter. This is 
in the order ABCD, as indicated by arrows in Figure 39. The path ABCD 
consists of four straight-line segments, which we consider in pairs, starting 
with BC and DA, which run in the y-direction. 


Line integrals are usually evaluated in parametric form, but these paths 
are simple enough to make this unnecessary. We can revert to the 
fundamental concept of a line integral — that of integrating the component 
of a field in the direction of travel along a path. With the coordinates 
shown in Figure 39, the line integral contributed by the side BC is 


$$I_{BC} = \int_{y_1}^{y_2} F_y(x_2, y, 0)\, dy,$$


and the contribution from the side DA is 


$$I_{DA} = -\int_{y_1}^{y_2} F_y(x_1, y, 0)\, dy.$$


The minus sign on the right makes good sense because in the limit where 
$x_1$ approaches $x_2$, the paths BC and DA become the reverse of one 
another, and must have opposite signs. The combined contribution from 
sides BC and DA is therefore 


$$I_{BC+DA} = \int_{y_1}^{y_2} \left( F_y(x_2, y, 0) - F_y(x_1, y, 0) \right) dy.$$


Now, the integrand can be simplified. Dividing and multiplying by 
$x_2 - x_1 = \delta L$, and assuming that the square is very small, we get 


$$F_y(x_2, y, 0) - F_y(x_1, y, 0) = \frac{F_y(x_2, y, 0) - F_y(x_1, y, 0)}{x_2 - x_1}\,(x_2 - x_1) \simeq \frac{\partial F_y}{\partial x}\,\delta L,$$


so 


$$I_{BC+DA} \simeq \int_{y_1}^{y_2} \frac{\partial F_y}{\partial x}\,\delta L\, dy.$$


Because the square is assumed to be very small, the integrand can be 
taken to be constant over the range of integration. The integral is then 
approximated by the product of the integrand and the length $y_2 - y_1 = \delta L$. 
So we have 


$$I_{BC+DA} \simeq \frac{\partial F_y}{\partial x}\,(\delta L)^2. \tag{51}$$


Figure 39 A square surface element in the xy-plane


A similar calculation can be done for the other two sides of the square. 
Their contribution is 


$$I_{AB+CD} = \int_{x_1}^{x_2} \left( F_x(x, y_1, 0) - F_x(x, y_2, 0) \right) dx.$$


In this case, the x-component of the field appears in an integral over x, 
and the signs are different: the larger value of y (namely $y_2$) now appears 
in the term that carries a minus sign. Bearing these changes in mind, and 
working through the same steps as before, leads to 


$$I_{AB+CD} \simeq \int_{x_1}^{x_2} \left( -\frac{\partial F_x}{\partial y}\,\delta L \right) dx = -\frac{\partial F_x}{\partial y}\,(\delta L)^2. \tag{52}$$


Finally, combining the contributions from equations (51) and (52), we 
conclude that the circulation of F around the square ABCD is 


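$$\text{circulation} \simeq \left( \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} \right) \delta S, \tag{53}$$


where $\delta S = (\delta L)^2$ is the area of the square. 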


You may recognise the combination of partial derivatives in round 
brackets. Using the definition of curl in Cartesian coordinates in 
equation (49), you can see that 


$$\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} = (\nabla\times\mathbf{F})_z = (\nabla\times\mathbf{F})\cdot\mathbf{k}. \tag{54}$$


Because the square has its unit normal in the z-direction, its oriented area 
is $\delta\mathbf{S} = \delta S\,\mathbf{k}$. Combining this with equations (53) and (54), we see that 


$$\text{circulation} \simeq (\nabla\times\mathbf{F})\cdot\delta\mathbf{S}. \tag{55}$$


This is a remarkable result. It establishes the link we have been seeking 
between curl and circulation. All the approximations made in deriving it 
become exact in the limit where the area of the square shrinks to zero. 
The result has been derived in a particular case, but there is nothing 
special about our choice of axes, or the location of the square. In fact, 
equation (55) applies to all tiny planar elements of any shape. This enables 
us to think about curl in a new way. 


Curl as circulation per unit area 


Given a vector field F in the vicinity of a given point, the component 
of ∇ × F in the direction of the unit vector n can be found by taking 
a planar element with unit normal n at the point. The component is 
given by 


$$(\nabla\times\mathbf{F})\cdot\mathbf{n} = \frac{\text{circulation around perimeter of element}}{\text{area of element}}, \tag{56}$$


in the limit where the element becomes very small. 


So each component of the curl at a given point can be interpreted as 
a circulation per unit area at that point. 
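

Equation (56) can be tested numerically on a small square. The sketch below is illustrative only: the field F = −y²i + xyj and the function names are assumptions, not taken from the text. For this field, ∂F_y/∂x − ∂F_x/∂y = y + 2y = 3y, so the circulation per unit area around a small square centred at (0.5, 0.4) should be close to 1.2.

```python
def square_circulation(field, xc, yc, a, n=200):
    """Midpoint-rule circulation of a 2D field around a square of side a
    centred at (xc, yc), traversed anticlockwise (unit normal along +z).
    `field(x, y)` returns (Fx, Fy)."""
    x1, x2 = xc - a / 2, xc + a / 2
    y1, y2 = yc - a / 2, yc + a / 2
    h = a / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += field(x1 + s, y1)[0] * h   # bottom edge, travelling in +x
        total += field(x2, y1 + s)[1] * h   # right edge, travelling in +y
        total -= field(x1 + s, y2)[0] * h   # top edge, travelling in -x
        total -= field(x1, y1 + s)[1] * h   # left edge, travelling in -y
    return total

def example_field(x, y):
    return (-y * y, x * y)   # illustrative field with (curl F)_z = 3y

a = 0.01
ratio = square_circulation(example_field, 0.5, 0.4, a) / a**2
print(ratio)   # circulation per unit area, to compare with 3 * 0.4 = 1.2
```

The four edge contributions correspond to the pairs BC + DA and AB + CD used in the derivation of equations (51) and (52).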


Example 13 


Consider the vector field 


$$\mathbf{F} = -y\,\mathbf{i} + x\,\mathbf{j}.$$


(a) Calculate the circulation of this vector field around the perimeter C of 
a tiny circular element of radius R, centred on the origin and lying in 
the xy-plane. The unit normal of this element is chosen to point in the 
positive z-direction. 


(b) Calculate the z-component of the curl of F at the origin. 


(c) Do your answers to parts (a) and (b) agree with equation (56)? 


Solution 


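(a) Using the given direction of the unit normal, the right-hand grip rule 
tells us that the path must be traversed anticlockwise (when viewed 
from the positive z-axis). This path can be represented by parametric 
equations of the form 


$$x = R\cos t, \quad y = R\sin t \quad (0 \le t \le 2\pi).$$


We have 


$$\mathbf{F} = -y\,\mathbf{i} + x\,\mathbf{j} = -R\sin t\,\mathbf{i} + R\cos t\,\mathbf{j},$$

$$\frac{d\mathbf{s}}{dt} = \frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j} = -R\sin t\,\mathbf{i} + R\cos t\,\mathbf{j},$$


so 


$$\mathbf{F}\cdot\frac{d\mathbf{s}}{dt} = F_x\frac{dx}{dt} + F_y\frac{dy}{dt} = R^2(\sin^2 t + \cos^2 t) = R^2.$$


The circulation around C is therefore 


$$\oint_C \mathbf{F}\cdot d\mathbf{s} = \int_0^{2\pi} R^2\, dt = 2\pi R^2.$$

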
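(b) At any point, the z-component of the curl of F is 


$$(\nabla\times\mathbf{F})\cdot\mathbf{k} = \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} = \frac{\partial(x)}{\partial x} - \frac{\partial(-y)}{\partial y} = 2.$$


In particular, (∇ × F) · k = 2 at the origin. 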


(c) Because the area of the circular element is $\pi R^2$, the right-hand side of 
equation (56) is equal to $2\pi R^2/\pi R^2 = 2$. This is equal to the left-hand 
side, calculated at the centre of the element. 


Figure 40 The additivity of circulation


Exercise 21 


Compare the two sides of equation (56) for a vector field $\mathbf{F} = x^2\,\mathbf{j}$ and a 
square element in the xy-plane with corners A, B, C and D at (x, y), 
(x + a, y), (x + a, y + a) and (x, y + a), respectively, where a is a small 
constant length. The unit normal of the element is taken to be in the 
positive z-direction. 


5.3 Additivity of circulation and the curl theorem 


You have seen that the divergence of a vector field at a given point can be 
interpreted as the flux per unit volume in a tiny region around the point. 
Moreover, the additivity of flux allowed us to derive a more powerful result 
— the divergence theorem — which applies over extended regions. Now we 
will do something similar for curl. 


The additivity of circulation 


Consider an open surface in the plane of the page, divided into a number 
of subregions. As always, the unit normals of the subregions are required 
to have consistent orientations. To consider a definite case, we take them 
to point out of the page towards you. Then the right-hand grip rule 
ensures that the perimeters of the subregions are all traversed in the same 
sense — in this case, anticlockwise. 


The additivity of circulation relates the circulation of a vector field around 
the surface to the sum of its circulations around the subregions. 


The additivity of circulation 


If an open surface is subdivided into consistently-oriented surface 
elements, the circulation of a vector field F around the perimeter of 
the surface is the sum of the circulations of F around the perimeters 
of all the elements. 


To see why this is true, look at Figure 40; this shows two neighbouring 
subregions $\Delta S_1$ and $\Delta S_2$, with perimeters $C_1$ (in red) and $C_2$ (in blue). 
These perimeters share a common segment AB, which is traversed in one 
sense for $C_1$ and in the opposite sense for $C_2$. So when we add the 
circulations around $C_1$ and $C_2$, the contributions from the common section 
AB cancel out. More generally, all the contributions from boundaries 
between subregions cancel out, leaving only contributions from sections 
that are not shared. But these non-shared sections form the perimeter of 
the whole surface. 


In fact, the surface need not be flat. All we need is an open surface, such 
as that in Figure 41, divided into surface elements. (Recall that an open 
surface is one that has a perimeter.) If the surface elements are oriented 
consistently, with neighbouring elements having similar, rather than 
opposing, unit normals, the additivity of circulation continues to apply. 


The curl theorem 


Finally, we can combine the additivity of circulation with the 
interpretation of curl as circulation per unit area. Suppose that we want to 
find the circulation of a vector field F around the perimeter C of an open 
surface. We can divide the surface into many tiny surface elements $S_i$ with 
perimeters $C_i$. The additivity of circulation tells us that 


$$\oint_C \mathbf{F}\cdot d\mathbf{s} = \sum_i (\text{circulation around } C_i),$$


where the sum is over the perimeters of all the surface elements that make 
up the surface. 


The surface elements are assumed to be tiny, so we can use equation (55) 
to express each circulation in terms of curl. This gives 


$$\oint_C \mathbf{F}\cdot d\mathbf{s} \simeq \sum_i (\nabla\times\mathbf{F})_i\cdot\delta\mathbf{S}_i.$$


In the limit where the surface elements approach zero size, the 
approximation becomes exact and the right-hand side becomes an integral. 
We arrive at the following important result. 


Curl theorem 


If F is a vector field and S is an open surface with perimeter C, then 


$$\oint_C \mathbf{F}\cdot d\mathbf{s} = \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{S}. \tag{57}$$


The curl theorem is just as important as the divergence theorem, and plays 
a central role in electromagnetism and fluid mechanics. 


Origins of the curl theorem 


The curl theorem is often called Stokes’s theorem after the 
mathematician George Stokes (Figure 42), although the connection 
with Stokes is rather shaky. 


The theorem was actually discovered by Lord Kelvin in 1850. Stokes 
learned about it in a letter from Kelvin, and set an exam question 
asking students to prove it. Exams must have been tough in those 
days! One of the students taking the exam was James Clerk Maxwell, 
who went on to use the theorem to help to frame the fundamental 
laws of electromagnetism. Stokes himself is famous for his work on 
fluid mechanics, wave motion and optics. 


Figure 41 The additivity of circulation on a curved surface


The curl theorem is the key result of this section. It links line integrals 
to related surface integrals. 


Figure 42 George Stokes (1819-1903)


Figure 43 Two surfaces with the same perimeter path


The curl theorem is useful because it can simplify the evaluation of 
integrals. For example, we can convert a line integral around a closed path 
into a simpler surface integral, as shown in the following example. 


Example 14 


Use the curl theorem to find the line integral of F = −yi + xj around a 
circular path C in the xy-plane, centred on the origin and of radius R. The 
path is traversed anticlockwise when seen from the positive z-axis. 


Solution 
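The curl of the two-dimensional vector field F is 


$$\nabla\times\mathbf{F} = \left( \frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} \right)\mathbf{k} = \left( \frac{\partial(x)}{\partial x} - \frac{\partial(-y)}{\partial y} \right)\mathbf{k} = 2\mathbf{k}.$$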


The path C is the perimeter of a circular disc of radius R, centred on the 
origin and in the xy-plane. Since C is traversed in an anticlockwise sense, 
the right-hand grip rule shows that the unit normal of this surface is k 
(rather than −k). 


Hence, using the curl theorem, the required line integral is 


$$\oint_C \mathbf{F}\cdot d\mathbf{s} = \int_{\text{disc}} 2\mathbf{k}\cdot\mathbf{k}\, dS = 2\pi R^2,$$


which agrees with the calculation in Example 13(a). 


Exercise 22 


Use the method of Example 14 to calculate the line integral of the vector 
field $\mathbf{F} = x^2\,\mathbf{i} + y^2\,\mathbf{j}$ around a rectangular path in the xy-plane with corners 
at (0,0), (2,0), (2,1), (0,1), traversed in that order, and returning to (0,0). 


It is worth noting that the curl theorem applies to all open surfaces, 
whether flat or not. In Figure 43, the surfaces $S_1$ and $S_2$ share the same 
perimeter path C. In this case, the curl theorem tells us that 


$$\oint_C \mathbf{F}\cdot d\mathbf{s} = \int_{S_1} (\nabla\times\mathbf{F})\cdot d\mathbf{S} = \int_{S_2} (\nabla\times\mathbf{F})\cdot d\mathbf{S},$$


for any vector field F. 


So if the vector field G is the curl of F (so that G = ∇ × F), we have 


$$\int_{S_1} \mathbf{G}\cdot d\mathbf{S} = \int_{S_2} \mathbf{G}\cdot d\mathbf{S}, \tag{58}$$


for any open surfaces $S_1$ and $S_2$ that share the same perimeter path. 


This gives us the freedom to replace the surface integral of a vector field G 
over a complicated surface by one over a much nicer surface, but only if G 
is a curl field (i.e. a field that is the curl of another vector field). This is 
reminiscent of the freedom that we have to adjust the paths of line integrals 
of gradient fields between fixed start and end points. 


Example 15 


The vector field $\mathbf{G} = z^2\,\mathbf{j} + x^2\,\mathbf{k}$ is a curl field. Use this fact to find the 
surface integral of G over the curved hemispherical surface S in Figure 44, 
with its unit normals pointing upwards as shown. 


Solution 


The perimeter of the hemispherical surface is a circle in the xy-plane, 
centred on the origin and of radius R. The circular disc at the base of the 
hemisphere shares this perimeter path. The curl theorem can be applied to 
both these surfaces, but to ensure that the perimeter path is traversed in 
the same sense in both cases, the unit normal of the disc must be chosen to 
be k (rather than —k). Because G is a curl field, we can replace the 
surface integral over the hemisphere by one over the disc. 


Hence 


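$$\int_S \mathbf{G}\cdot d\mathbf{S} = \int_{\text{disc}} \mathbf{G}\cdot d\mathbf{S} = \int_{\text{disc}} (z^2\,\mathbf{j} + x^2\,\mathbf{k})\cdot\mathbf{k}\, dS = \int_{\text{disc}} x^2\, dS.$$


So we just need to integrate $x^2$ over a circular disc. Using polar 
coordinates, we get 


$$\int_S \mathbf{G}\cdot d\mathbf{S} = \int_{\phi=0}^{\phi=2\pi} \left( \int_{r=0}^{r=R} r^2\cos^2\phi \times r\, dr \right) d\phi = \int_0^{2\pi} \cos^2\phi\, d\phi \times \int_0^R r^3\, dr = \pi \times \tfrac{1}{4}R^4 = \tfrac{1}{4}\pi R^4.$$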
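

The result of Example 15 can be checked by integrating G over the hemisphere itself, which is the harder calculation that the curl theorem lets us avoid. The Python sketch below is an illustration only; the function name `hemisphere_flux` and the midpoint-rule grid are assumptions. It uses spherical polar angles θ and φ, for which the upward unit normal is radial and the area element is R² sin θ δθ δφ.

```python
import math

def hemisphere_flux(R, n=200):
    """Midpoint-rule estimate of the surface integral of G = z^2 j + x^2 k
    over the upper hemisphere of radius R, with upward-pointing normals."""
    d_theta = (math.pi / 2) / n
    d_phi = (2 * math.pi) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = (j + 0.5) * d_phi
            # Components of the radial unit normal at this point.
            nx = math.sin(theta) * math.cos(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(theta)
            x, z = R * nx, R * nz
            g_dot_n = z**2 * ny + x**2 * nz
            total += g_dot_n * R**2 * math.sin(theta) * d_theta * d_phi
    return total

R = 1.0
# The direct estimate should be close to (1/4) * pi * R^4.
print(hemisphere_flux(R), 0.25 * math.pi * R**4)
```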


Exercise 23 


The vector field $\mathbf{G} = -2xz\,\mathbf{i} + (x^2 + z^2)\,\mathbf{k}$ is a curl field. Use the method of 
Example 15 to calculate the surface integral of G over the curved surface of 
the bell shown in the margin. The open mouth of this bell is a circle in the 
xy-plane, centred on the origin and of radius R. The body of the bell lies 
in the region z > 0, and its unit normals point in the sense of increasing z. 


Finally, there is some unfinished technical business. Subsection 3.3 proved 
that every conservative field has zero curl. The curl test assumes that the 
converse is true; so if the curl of F is equal to zero throughout its domain, 
it assumes that F is conservative. This is clearly unsafe logic: every owl is 
a bird, but not every bird is an owl! The curl test is usually reliable, but 
there is a proviso that can now be explained. 


Suppose that F is defined throughout the whole of space, and that 
∇ × F = 0 everywhere. Then for any closed path C, the curl theorem tells 
us that 


$$\oint_C \mathbf{F}\cdot d\mathbf{s} = \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} = 0,$$


where S is an open surface with C as its perimeter. 


For the field G of Example 15, you can check that G = ∇ × F, 
where $\mathbf{F} = \tfrac{1}{3}(z^3\,\mathbf{i} + x^3\,\mathbf{j})$. 


Figure 44 A hemispherical surface


Figure 45 Two paths from A to B; reversing a path reverses the sign of a line integral


Figure 46 A case in which the domain of a two-dimensional vector field F(x, y) excludes the origin (the loops $C_1$ and $C_2$ are marked)


Under these circumstances, we can say that F has zero circulation around 
any closed loop C. It is easy to see that this implies that all the line 
integrals of F are path-independent. For example, Figure 45 shows red and 
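blue paths leading from A to B. The line integral of F around a closed 
path C that follows the red path and the reverse of the blue path is 


$$\oint_C \mathbf{F}\cdot d\mathbf{s} = \int_{AB\,(\text{red})} \mathbf{F}\cdot d\mathbf{s} - \int_{AB\,(\text{blue})} \mathbf{F}\cdot d\mathbf{s}.$$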


If this is equal to zero, the line integrals along the red and blue paths must 
be equal. This independence of path allows us to conclude that F is 
conservative, and almost proves the curl test — but there is a loophole. 


To see what can go wrong, consider the two-dimensional situation shown 
in Figure 46, where the vector field F is not defined at the origin. If we 
consider a closed loop $C_1$ that encircles the origin, we can see that this is 
part of the boundary of an open surface S within the domain of F. But it 
is not the complete boundary: there is also another portion $C_2$, closer to 
the origin. Now the fact that ∇ × F = 0 tells us that the circulations 
around $C_1$ and $C_2$ cancel one another out, but they might not 
individually vanish. Under these circumstances, the curl test fails. In 
technical language, the curl test requires the domain of the field to be 
simply-connected. 


A region is simply-connected if any closed loop in the region can be 
continuously shrunk to a point without leaving the region. For example, 
the whole of space is simply-connected. A typical Swiss cheese (with 
isolated holes) is also simply-connected. However, a plane with the origin 
removed, and three-dimensional space with the z-axis removed, are not 
simply-connected. In such domains, the condition ∇ × F = 0 does not 
guarantee the path-independence of all line integrals, so the curl test fails. 
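

A standard illustration of this loophole, which is not taken from the text above, is the two-dimensional field F = (−yi + xj)/(x² + y²), whose domain excludes the origin. The following sketch (all function names are hypothetical) estimates the z-component of its curl at a point away from the origin, and its circulation around a circle centred on the origin.

```python
import math

def vortex(x, y):
    """F = (-y i + x j) / (x^2 + y^2); not defined at the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def curl_z(field, x, y, h=1e-5):
    """Central-difference estimate of dFy/dx - dFx/dy."""
    dfy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dfx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dfy_dx - dfx_dy

def circle_circulation(field, radius, n=2000):
    """Circulation around an anticlockwise circle centred on the origin."""
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        fx, fy = field(x, y)
        # ds/dt = (-radius sin t, radius cos t)
        total += (fx * (-radius * math.sin(t)) + fy * radius * math.cos(t)) * dt
    return total

print(curl_z(vortex, 0.7, -0.3))          # close to 0
print(circle_circulation(vortex, 1.0))    # close to 2*pi, not 0
```

The curl vanishes wherever the field is defined, yet the circulation around any circle enclosing the origin is 2π, so the field is not conservative on this domain.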


Learning outcomes 


After studying this unit, you should be able to do the following. 
• Calculate the line integral of a scalar field in Cartesian coordinates. 

• Calculate the line integral of a vector field in Cartesian coordinates. 

• State and apply the properties of conservative fields. Simplify line 
integrals involving conservative fields by choosing appropriate paths. 

• Calculate the conservative vector field corresponding to a given scalar 
potential field, and find a scalar potential field corresponding to a 
given conservative vector field. 

• Use the curl test to decide whether or not a given vector field is 
conservative. 

• Define the terms closed surface, open surface, flux and oriented area, 
and understand the convention for the unit normals of a closed surface. 

• Calculate the surface integral (or flux) of a vector field over a given 
surface (in simple cases). 

• Interpret divergence as flux per unit volume. State and apply the 
additivity of flux and the divergence theorem. 

• Use the right-hand grip rule to find the positive sense of progression 
around a given closed loop. Define and calculate circulation of a 
vector field. 

• Interpret curl as circulation per unit area. State and apply the 
additivity of circulation and the curl theorem. 


Appendix: two insights 


This Appendix is for interest and enjoyment only. It will not help you 
with calculations, but it contains two interesting insights that unify 
different topics in this book. You can read it when you have the time 
(possibly after studying the module). 


Unifying various types of integral 


In ordinary calculus, we can say that 


$$\int_a^b \frac{df}{dx}\, dx = f(b) - f(a), \tag{59}$$


a result known as the fundamental theorem of calculus because it 
brings together derivatives and integrals. 


A similar result applies to gradients and line integrals: 


$$\int_{A \to B} \nabla U\cdot d\mathbf{s} = U_B - U_A. \tag{60}$$


This is the content of equations (24) and (26), although the sign 
convention relating F and U inserted minus signs in those equations. 
Equation (60) is sometimes called the gradient theorem. 
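

The gradient theorem is easy to test numerically. In the sketch below, which is illustrative only, the scalar field U = x²y, the parabolic path from A = (0, 0) to B = (1, 1) and all function names are assumptions. The line integral of ∇U along the path should be close to U_B − U_A = 1.

```python
def U(x, y):
    return x * x * y            # assumed scalar field

def grad_U(x, y):
    return (2 * x * y, x * x)   # its gradient, worked out by hand

def parabola(t):
    return (t, t * t)           # assumed path from A = (0, 0) to B = (1, 1)

def line_integral(field, path, t1, t2, n=2000):
    """Chord-based estimate of the line integral of a 2D vector field."""
    dt = (t2 - t1) / n
    total = 0.0
    for i in range(n):
        x0, y0 = path(t1 + i * dt)
        x1, y1 = path(t1 + (i + 1) * dt)
        fx, fy = field(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total

lhs = line_integral(grad_U, parabola, 0.0, 1.0)
rhs = U(1.0, 1.0) - U(0.0, 0.0)
print(lhs, rhs)   # the two values should nearly agree
```

Replacing `parabola` by any other path with the same end points should leave the result essentially unchanged, since only the boundary values of U matter.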


Two other important results relating derivatives and integrals were 
discussed in this unit. The curl theorem can be written as 


$$\int_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} = \oint_C \mathbf{F}\cdot d\mathbf{s}, \tag{61}$$


and the divergence theorem can be written as 


$$\int_V \nabla\cdot\mathbf{F}\, dV = \int_S \mathbf{F}\cdot d\mathbf{S}. \tag{62}$$


There is a feature that unifies these four theorems. In each case, the 
left-hand side involves something that is differentiated and then integrated 
over a region, while the right-hand side contains no derivative, and is 
formed from values on the boundary of the region. 


Figure 47 A volume element in (u, v, w) coordinates


• In equation (59), the region is an interval along the x-axis, and its 
boundary is the pair of points x = a and x = b at either end of the 
interval. 

• In equation (60), the region is a curved path, and its boundary 
consists of the pair of points A and B at either end of the path. 

• In equation (61), the region is an open surface S, and its boundary is 
the closed path C that forms its perimeter. 

• In equation (62), the region is a volume V, and its boundary is the 
surface S of this volume. 


From this perspective, all of these theorems belong to the same family. 


Expressions for divergence and curl in orthogonal coordinates 


Unit 9 gave general formulas for divergence and curl in orthogonal 
coordinates. The optional Appendix of Unit 9 justified these formulas in a 
direct way, but it involved lengthy calculations. The divergence and curl 
theorems allow us to give alternative justifications that are simpler and 
more attractive. 


According to Unit 9, in any orthogonal coordinate system (u,v, w), 
divergence is given by 


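$$\nabla\cdot\mathbf{F} = \frac{1}{J}\left[ \frac{\partial}{\partial u}\left( \frac{J F_u}{h_u} \right) + \frac{\partial}{\partial v}\left( \frac{J F_v}{h_v} \right) + \frac{\partial}{\partial w}\left( \frac{J F_w}{h_w} \right) \right], \tag{63}$$


where $h_u$, $h_v$ and $h_w$ are scale factors, and $J = h_u h_v h_w$ is the Jacobian 
factor. 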


The divergence theorem tells us that divergence can be interpreted as flux 
per unit volume. We can show that equation (63) is a direct expression of 
this fact. To see why, look at Figure 47, which shows a small volume 
element for the (u,v, w) orthogonal coordinate system. The volume of this 
element is 


$$\delta V = h_u h_v h_w\, \delta u\, \delta v\, \delta w. \tag{64}$$


We need to calculate the flux of a vector field F over the surface of the 
volume element. First, consider the two curved faces A (red) and B 
(green), which are perpendicular to the u-axis. 


The calculation goes along the same lines as that given in Subsection 4.3 
for Cartesian coordinates, but there is one significant difference. The faces 
A and B are generated by the same increments δv and δw, but they may 
have different areas. We must take account of this when we evaluate the 
fluxes. The area of each face is 


$$\delta A = (h_v\, \delta v) \times (h_w\, \delta w), \tag{65}$$


and the faces have different areas if the scale factors $h_v$ and $h_w$ depend 
on u. Not surprisingly, equation (40) is replaced by 


$$\text{flux over } (A + B) = \frac{\partial (F_u\, \delta A)}{\partial u}\, \delta u,$$


where the area of a face is now inside the partial derivative. 


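Using equation (65) and recalling that δv and δw are the same for both 
faces, we get 


$$\text{flux over } (A + B) = \frac{\partial (F_u h_v h_w)}{\partial u}\, \delta u\, \delta v\, \delta w = \frac{\partial}{\partial u}\left( \frac{J F_u}{h_u} \right) \delta u\, \delta v\, \delta w.$$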


Of course, there are similar expressions for the other two pairs of faces, 
perpendicular to the v- and w-axes. Adding together all these fluxes gives 
the total flux over the surface of the volume element: 


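$$\text{total flux} = \left[ \frac{\partial}{\partial u}\left( \frac{J F_u}{h_u} \right) + \frac{\partial}{\partial v}\left( \frac{J F_v}{h_v} \right) + \frac{\partial}{\partial w}\left( \frac{J F_w}{h_w} \right) \right] \delta u\, \delta v\, \delta w. \tag{66}$$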


Divergence is flux per unit volume. We therefore divide equation (66) by 
equation (64), and take the limit of a tiny volume element. In this limit, 
all our approximations become exact, and we recover equation (63). 
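

As a concrete check of equation (63), consider spherical coordinates (r, θ, φ), for which the scale factors are h_r = 1, h_θ = r and h_φ = r sin θ. The sketch below is an illustration under these assumptions (the function names are hypothetical). It evaluates equation (63) by central differences for the field F = r e_r, which is xi + yj + zk in Cartesian form and therefore has divergence 3 everywhere.

```python
import math

def scale_factors(r, theta, phi):
    """Scale factors (h_r, h_theta, h_phi) for spherical coordinates."""
    return (1.0, r, r * math.sin(theta))

def divergence_orthogonal(components, r, theta, phi, h=1e-5):
    """Evaluate equation (63) by central differences.
    `components(r, theta, phi)` returns (F_r, F_theta, F_phi)."""
    def term(index, point):
        hs = scale_factors(*point)
        jacobian = hs[0] * hs[1] * hs[2]
        return jacobian * components(*point)[index] / hs[index]

    point = [r, theta, phi]
    total = 0.0
    for index in range(3):
        up = list(point)
        up[index] += h
        down = list(point)
        down[index] -= h
        total += (term(index, up) - term(index, down)) / (2 * h)
    hs = scale_factors(r, theta, phi)
    return total / (hs[0] * hs[1] * hs[2])

def radial(r, theta, phi):
    return (r, 0.0, 0.0)   # F = r e_r, i.e. x i + y j + z k

print(divergence_orthogonal(radial, 2.0, 1.0, 0.5))   # close to 3
```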


We can also justify the formula for curl in orthogonal coordinates. In any 
right-handed orthogonal coordinate system (u, v, w), with scale factors $h_u$, 
$h_v$ and $h_w$, Unit 9 gave the following formula for curl: 


$$\nabla\times\mathbf{F} = \frac{1}{h_u h_v h_w} \begin{vmatrix} h_u\mathbf{e}_u & h_v\mathbf{e}_v & h_w\mathbf{e}_w \\ \dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial w} \\ h_u F_u & h_v F_v & h_w F_w \end{vmatrix}. \tag{67}$$


Figure 48 shows a tiny area element based on the orthogonal coordinates 
(u, v, w). This element is perpendicular to the w-axis, and we can obtain 
an expression for the w-component of ∇ × F by finding the circulation per 
unit area of F around it. 


The calculation goes along similar lines to that given in Subsection 5.2 for 
Cartesian coordinates. The main new feature is that opposite sides of the 
element need not be equal in length. Their lengths depend on scale factors, 
which may vary from point to point. Taking this into account, it is not 
difficult to show that the circulation around the element in Figure 48 is 


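$$\text{circulation} = \left(\frac{\partial (h_v F_v)}{\partial u} - \frac{\partial (h_u F_u)}{\partial v}\right)\delta u\,\delta v, \tag{68}$$

while the area of the element is 

$$\delta A = h_u h_v\,\delta u\,\delta v.$$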


Curl is circulation per unit area. We therefore divide equation (68) 
by $\delta A$, and take the limit of a tiny area element. In this limit, all our 
approximations become exact, and we recover the $w$-component of 
equation (67). Similar arguments give the other two components of the 
curl. 


Figure 48 A surface element in the $(u, v, w)$ coordinate system, with sides of length $h_u\,\delta u$ and $h_v\,\delta v$ 


Unit 10 Integrating scalar and vector fields 
Solutions to exercises 


Solution to Exercise 1 
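We have $dx/dt = 2$ and $dy/dt = 2t$, so 

$$\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 = 4 + 4t^2 = 4(1 + t^2).$$

The length of the parabolic arc is 

$$L = \int_{-1}^{1} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt = 2\int_{-1}^{1} \sqrt{1 + t^2}\,dt.$$

Using the standard integral given in the question, we conclude that 

$$L = \left[\,t\sqrt{1 + t^2} + \ln\left(t + \sqrt{1 + t^2}\right)\right]_{t=-1}^{t=1} = 2\sqrt{2} + \ln(\sqrt{2} + 1) - \ln(\sqrt{2} - 1) \simeq 4.59.$$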
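
As a cross-check on this arc length, the integral can be evaluated numerically and compared with the closed form. The following is a minimal sketch using composite Simpson's rule; the helper names `simpson` and `speed` are illustrative and not part of the original solution.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def speed(t):
    # sqrt((dx/dt)^2 + (dy/dt)^2) with dx/dt = 2 and dy/dt = 2t
    return math.sqrt(2**2 + (2 * t)**2)

numeric_length = simpson(speed, -1.0, 1.0)
closed_form = 2 * math.sqrt(2) + math.log(math.sqrt(2) + 1) - math.log(math.sqrt(2) - 1)
print(numeric_length, closed_form)  # both should be close to 4.59
```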


Solution to Exercise 2 


We have 

$$\frac{dx}{dt} = -a\sin t, \quad \frac{dy}{dt} = a\cos t, \quad \frac{dz}{dt} = b,$$

so 

$$\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2 = (-a\sin t)^2 + (a\cos t)^2 + b^2 = a^2 + b^2.$$

The length of the helical path is therefore 

$$L = \int_0^{2\pi} \sqrt{a^2 + b^2}\,dt = 2\pi\sqrt{a^2 + b^2}.$$

Check: In the limit where $b = 0$, our answer reduces to $2\pi a$, which is the 
circumference of a circle, as expected. 
Solution to Exercise 3 


The total number of ants is given by the line integral 

$$N = \int_C \lambda\,dl.$$

From the given parametric representation, we have 

$$\lambda(x(t), y(t)) = \frac{A}{R^4}(R\cos t)^2 (R\sin t) = \frac{A}{R}\cos^2 t\,\sin t.$$

Also, 

$$\frac{dx}{dt} = -R\sin t, \quad \frac{dy}{dt} = R\cos t,$$

so 

$$\delta l \simeq \sqrt{(-R\sin t)^2 + (R\cos t)^2}\,\delta t = R\,\delta t.$$

Hence 

$$N = \int_0^{\pi} \frac{A}{R}\cos^2 t\,\sin t \times R\,dt = A\int_0^{\pi} \cos^2 t\,\sin t\,dt.$$

The integral can be completed by substituting $u = \cos t$. Then 
$du/dt = -\sin t$. The lower limit $t = 0$ corresponds to $u = \cos 0 = 1$, and 
the upper limit $t = \pi$ corresponds to $u = \cos\pi = -1$, so 

$$N = A\int_{t=0}^{t=\pi} u^2\left(-\frac{du}{dt}\right)dt = -A\int_{u=1}^{u=-1} u^2\,du = -A\left[\tfrac{1}{3}u^3\right]_{u=1}^{u=-1} = \tfrac{2}{3}A.$$
u=l1 


Solution to Exercise 4 
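We have $dr/dt = 2$ and $d\phi/dt = 1$, so 

$$L = \int_0^5 \sqrt{4 + 4t^2 \times 1}\,dt = 2\int_0^5 \sqrt{1 + t^2}\,dt.$$

Using the standard integral given in Exercise 1, we get 

$$L = \left[\,t\sqrt{1 + t^2} + \ln\left(t + \sqrt{1 + t^2}\right)\right]_0^5 = 5\sqrt{26} + \ln(\sqrt{26} + 5) \simeq 27.8.$$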

Solution to Exercise 5 

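Along path A, $d\theta/dt = 1$ and $d\phi/dt = 0$, so equation (14) gives 

$$L = R\int_0^{\pi/2} \sqrt{1}\,dt = \frac{\pi}{2}R.$$

Along path B, $d\theta/dt = 0$, $d\phi/dt = 1$ and $\sin\theta = \sin(\pi/6) = \tfrac{1}{2}$, so 

$$L = R\int_0^{\pi/2} \sqrt{\tfrac{1}{4}}\,dt = \frac{\pi}{4}R.$$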


Solution to Exercise 6 


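Differentiating the parametric equations $x = 2 - t$ and $y = t$ gives 

$$\frac{dx}{dt} = -1, \quad \frac{dy}{dt} = 1.$$

Expressing the components of $\mathbf{F}$ in terms of $t$, we get 

$$F_x = x - y = (2 - t) - t = 2(1 - t), \quad F_y = x + y = (2 - t) + t = 2.$$

So 

$$\mathbf{F}\cdot\frac{d\mathbf{s}}{dt} = F_x\frac{dx}{dt} + F_y\frac{dy}{dt} = -2(1 - t) + 2 = 2t.$$

The required line integral is 

$$\int_C \mathbf{F}\cdot d\mathbf{s} = \int_0^2 2t\,dt = \left[t^2\right]_0^2 = 4.$$

Solution to Exercise 7 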


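Differentiating the parametric equations gives 

$$\frac{dx}{dt} = 1, \quad \frac{dy}{dt} = 2, \quad \frac{dz}{dt} = 0.$$

Expressing the components of $\mathbf{F}$ in terms of $t$, we get 

$$F_x = yz = 4(1 + 2t), \quad F_y = xz = 4t, \quad F_z = xy = t(1 + 2t).$$

Hence 

$$\mathbf{F}\cdot\frac{d\mathbf{s}}{dt} = 4(1 + 2t) + 8t = 4 + 16t.$$

The required line integral is therefore 

$$\int_C \mathbf{F}\cdot d\mathbf{s} = \int_0^1 (4 + 16t)\,dt = \left[4t + 8t^2\right]_0^1 = 12.$$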


Solution to Exercise 8 


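Differentiating the parametric equations gives 

$$\frac{dx}{dt} = -1, \quad \frac{dy}{dt} = -2, \quad \frac{dz}{dt} = 0.$$

Expressing the components of $\mathbf{F}$ in terms of $t$, we get 

$$F_x = yz = 4(3 - 2t), \quad F_y = xz = 4(1 - t), \quad F_z = xy = (1 - t)(3 - 2t).$$

Hence 

$$\mathbf{F}\cdot\frac{d\mathbf{s}}{dt} = -4(3 - 2t) - 8(1 - t) = -20 + 16t.$$

The required line integral is therefore 

$$\int_{C_{\text{rev}}} \mathbf{F}\cdot d\mathbf{s} = \int_0^1 (-20 + 16t)\,dt = \left[-20t + 8t^2\right]_0^1 = -12,$$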
which is minus the answer of Exercise 7, as expected. 
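
The sign reversal found in Exercises 7 and 8 can be confirmed numerically. The sketch below assumes the field F = yz i + xz j + xy k and the parametrisations x = t, y = 1 + 2t, z = 4 (forward) and x = 1 - t, y = 3 - 2t, z = 4 (reverse), which are inferred from the components quoted in the two solutions rather than taken from the exercise statements. The helper `line_integral` is an illustrative name.

```python
def line_integral(F, path, velocity, n=1000):
    """Midpoint-rule approximation to the integral of F . (ds/dt) dt over 0 <= t <= 1."""
    dt = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        Fx, Fy, Fz = F(*path(t))
        vx, vy, vz = velocity(t)
        total += (Fx * vx + Fy * vy + Fz * vz) * dt
    return total

def F(x, y, z):
    # Assumed field, inferred from F_x = yz, F_y = xz, F_z = xy
    return (y * z, x * z, x * y)

# Forward path (assumed): x = t, y = 1 + 2t, z = 4
forward = line_integral(F, lambda t: (t, 1 + 2 * t, 4.0), lambda t: (1.0, 2.0, 0.0))
# Reverse path (assumed): x = 1 - t, y = 3 - 2t, z = 4
reverse = line_integral(F, lambda t: (1 - t, 3 - 2 * t, 4.0), lambda t: (-1.0, -2.0, 0.0))
print(forward, reverse)  # should be close to 12 and -12
```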


Solution to Exercise 9 
Because F is a gradient field, any line integral with start point (1,1) and 
end point (7,3) has value 


$$\int_C \mathbf{F}\cdot d\mathbf{s} = U(1,1) - U(7,3) = 0 - (-20) = 20.$$


Solution to Exercise 10 


Because the field is conservative, we can choose any convenient path with 
the given start and end points. A simple choice is the straight-line path C 
from the origin (0,0) to (1,2). Along this path $y = 2x$, so a suitable 
parametrisation is 

$$x = t, \quad y = 2t \quad (0 \le t \le 1).$$

Then we have $dx/dt = 1$ and $dy/dt = 2$, so 

$$\mathbf{F}\cdot\frac{d\mathbf{s}}{dt} = F_x\frac{dx}{dt} + F_y\frac{dy}{dt} = (3t^2 \times 2t) \times 1 + (t^3 + 8t^3) \times 2 = 6t^3 + 18t^3 = 24t^3.$$

So the line integral is 

$$\int_C \mathbf{F}\cdot d\mathbf{s} = \int_0^1 24t^3\,dt = 6.$$
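
Path independence for this field can be illustrated by integrating along a second route between the same end points. The sketch below assumes F = 3x^2 y i + (x^3 + y^3) j, which is inferred from the substituted expressions in the solution, and compares the straight-line path with a two-leg path along the x-axis to (1,0) and then parallel to the y-axis to (1,2). The helper names are illustrative.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def F(x, y):
    # Assumed field, inferred from F_x = 3t^2 * 2t and F_y = t^3 + 8t^3 on the line y = 2x
    return (3 * x**2 * y, x**3 + y**3)

def integrand(path, velocity):
    """Return t -> F(path(t)) . velocity(t)."""
    def g(t):
        Fx, Fy = F(*path(t))
        vx, vy = velocity(t)
        return Fx * vx + Fy * vy
    return g

# Straight line from (0,0) to (1,2): x = t, y = 2t
straight = simpson(integrand(lambda t: (t, 2 * t), lambda t: (1.0, 2.0)), 0.0, 1.0)
# Two legs: (0,0) -> (1,0) along the x-axis, then (1,0) -> (1,2)
leg1 = simpson(integrand(lambda t: (t, 0.0), lambda t: (1.0, 0.0)), 0.0, 1.0)
leg2 = simpson(integrand(lambda t: (1.0, 2 * t), lambda t: (0.0, 2.0)), 0.0, 1.0)
two_legs = leg1 + leg2
print(straight, two_legs)  # both should be close to 6
```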


Solution to Exercise 11 


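The scalar potential field is given by 

$$U(\mathbf{r}) = -\int \mathbf{F}\cdot d\mathbf{s}.$$

We consider an arbitrary point $\mathbf{r} = (a, b)$, and choose a straight-line path 
from the origin to this point. This path can be described by the 
parametric equations 

$$x = at, \quad y = bt \quad (0 \le t \le 1).$$

The values of $a$ and $b$ are constant along the path, so 

$$\frac{dx}{dt} = a \quad \text{and} \quad \frac{dy}{dt} = b.$$

Hence 

$$\mathbf{F}\cdot\frac{d\mathbf{s}}{dt} = F_x\frac{dx}{dt} + F_y\frac{dy}{dt} = a\cos(at) + b\sin(bt)$$

and 

$$U(a, b) = -\int_0^1 \left(a\cos(at) + b\sin(bt)\right)dt = -\left[\sin(at) - \cos(bt)\right]_{t=0}^{t=1} = \cos b - \sin a - 1.$$

However, the point $(a, b)$ is arbitrary, so for any point $(x, y)$, 

$$U(x, y) = \cos y - \sin x - 1.$$

This answer can be checked by taking its gradient: 

$$-\frac{\partial U}{\partial x} = \cos x = F_x, \quad -\frac{\partial U}{\partial y} = \sin y = F_y.$$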
Solution to Exercise 12 
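(a) We have 

$$\nabla \times \mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ y & x & z \end{vmatrix} = \mathbf{0}.$$

So the curl test shows that $\mathbf{F}$ is conservative. 

(b) We have 

$$\nabla \times \mathbf{G} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ -y & x & z \end{vmatrix} = 2\mathbf{k}.$$

This is not equal to $\mathbf{0}$ everywhere, so $\mathbf{G}$ is not conservative. 

(Margin note, relevant to Solution to Exercise 14: from equation (38), $\mathbf{k}\cdot\mathbf{e}_r = \cos\theta$.) 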


Solution to Exercise 13 


For element A, the unit normal is $\mathbf{i}$, so the normal component of the field 
is $\mathbf{i}\cdot\mathbf{F} = 2$. This element has area 2, so the flux over it is equal to 4. 

For element B, the unit normal is $-\mathbf{j}$, so the normal component of the field 
is $-\mathbf{j}\cdot\mathbf{F} = -3$. This element has area 1, so the flux over it is equal to $-3$. 

Solution to Exercise 14 

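The coordinate transformation equations for spherical coordinates are 

$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta.$$

So on the curved surface of the hemisphere, where $r = R$, the vector field is 

$$\mathbf{F} = z\mathbf{k} = R\cos\theta\,\mathbf{k}.$$

The unit normals shown in the figure point in the same outward direction 
as those for a complete sphere, so the formula for $\mathbf{J}$ in equation (37) can 
be used for the hemisphere. We therefore have 

$$\mathbf{F}\cdot\mathbf{J} = (R\cos\theta\,\mathbf{k})\cdot(R^2\sin\theta\,\mathbf{e}_r) = R^3\cos\theta\sin\theta\,\mathbf{k}\cdot\mathbf{e}_r = R^3\cos^2\theta\sin\theta.$$

The surface of the hemisphere is defined by $0 \le \theta \le \pi/2$ and $0 \le \phi \le 2\pi$, 
so the required flux is 

$$\int_S \mathbf{F}\cdot d\mathbf{S} = R^3\int_{\phi=0}^{\phi=2\pi}\left(\int_{\theta=0}^{\theta=\pi/2}\cos^2\theta\sin\theta\,d\theta\right)d\phi.$$

To carry out the integral over $\theta$, we make the substitution $u = \cos\theta$. Then 
$du/d\theta = -\sin\theta$. The limit $\theta = 0$ corresponds to $u = 1$, and the limit 
$\theta = \pi/2$ corresponds to $u = 0$, so 

$$\int_{\theta=0}^{\theta=\pi/2}\cos^2\theta\sin\theta\,d\theta = \int_{\theta=0}^{\theta=\pi/2} u^2\left(-\frac{du}{d\theta}\right)d\theta = -\int_{u=1}^{u=0} u^2\,du = \frac{1}{3}.$$

The integral over $\phi$ then contributes a factor $2\pi$, so the required flux is $\tfrac{2}{3}\pi R^3$. 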

Solution to Exercise 15 

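Following the same method as in Exercise 14, we have 

$$\mathbf{F} = 3\mathbf{k} \quad \text{and} \quad \mathbf{J} = R^2\sin\theta\,\mathbf{e}_r,$$

so 

$$\mathbf{F}\cdot\mathbf{J} = 3R^2\sin\theta\,\mathbf{k}\cdot\mathbf{e}_r = 3R^2\sin\theta\cos\theta = \tfrac{3}{2}R^2\sin(2\theta).$$

The flux over the hemispherical surface is 

$$\int_S \mathbf{F}\cdot d\mathbf{S} = \tfrac{3}{2}R^2\int_{\phi=0}^{\phi=2\pi}\left(\int_{\theta=0}^{\theta=\pi/2}\sin(2\theta)\,d\theta\right)d\phi = \tfrac{3}{2}R^2\int_{\phi=0}^{\phi=2\pi}\left[-\tfrac{1}{2}\cos(2\theta)\right]_0^{\pi/2}d\phi = \tfrac{3}{2}R^2\int_0^{2\pi} 1\,d\phi = 3\pi R^2.$$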
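
The surface integrals in Exercises 14 and 15 can be approximated directly as double integrals over $\theta$ and $\phi$. The sketch below uses a midpoint rule with F . J = (F . e_r) R^2 sin(theta), takes R = 2 as an arbitrary test radius, and uses an illustrative helper name `flux_over_hemisphere`; none of these choices come from the text. For F = z k the estimate should be close to 2 pi R^3 / 3, and for F = 3 k close to 3 pi R^2.

```python
import math

def flux_over_hemisphere(F_dot_er, R, n_theta=400, n_phi=8):
    """Midpoint-rule estimate of the flux over 0 <= theta <= pi/2, 0 <= phi <= 2 pi.

    F_dot_er(R, theta, phi) is the radial component of the field on the surface;
    the area factor is R^2 sin(theta).
    """
    d_theta = (math.pi / 2) / n_theta
    d_phi = (2 * math.pi) / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += F_dot_er(R, theta, phi) * R**2 * math.sin(theta) * d_theta * d_phi
    return total

R = 2.0  # arbitrary test radius
# F = z k on the surface: F . e_r = (R cos(theta)) * (k . e_r) = R cos^2(theta)
flux_zk = flux_over_hemisphere(lambda R, th, ph: R * math.cos(th)**2, R)
# F = 3 k: F . e_r = 3 cos(theta)
flux_3k = flux_over_hemisphere(lambda R, th, ph: 3 * math.cos(th), R)
print(flux_zk, 2 * math.pi * R**3 / 3)
print(flux_3k, 3 * math.pi * R**2)
```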


Solution to Exercise 16 


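The divergence of $\mathbf{F}$ is 

$$\nabla\cdot\mathbf{F} = \frac{\partial(x)}{\partial x} + \frac{\partial(y)}{\partial y} + \frac{\partial(z)}{\partial z} = 3.$$

Using the divergence theorem, the required surface integral is 

$$\int_S \mathbf{F}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{F}\,dV = \int_V 3\,dV,$$

where the volume integral is over the volume of a sphere of radius $R$. 
Hence 

$$\int_S \mathbf{F}\cdot d\mathbf{S} = 3 \times \tfrac{4}{3}\pi R^3 = 4\pi R^3.$$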

Solution to Exercise 17 


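As in Example 12, we consider the whole surface of the hemisphere, $S$, 
which includes the curved dome $S_1$ and the flat base $S_2$. Then 

$$\int_S \mathbf{F}\cdot d\mathbf{S} = \int_{S_1} \mathbf{F}\cdot d\mathbf{S} + \int_{S_2} \mathbf{F}\cdot d\mathbf{S}.$$

On the flat base of the hemisphere, $z = 0$, so in this case 

$$\int_{S_2} \mathbf{F}\cdot d\mathbf{S} = 0$$

and 

$$\int_{S_1} \mathbf{F}\cdot d\mathbf{S} = \int_S \mathbf{F}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{F}\,dV,$$

where $V$ is the region bounded by the hemispherical surface and its base. 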



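The divergence of $\mathbf{F} = z\mathbf{k}$ is 

$$\nabla\cdot\mathbf{F} = \frac{\partial(0)}{\partial x} + \frac{\partial(0)}{\partial y} + \frac{\partial(z)}{\partial z} = 1,$$

so 

$$\int_{S_1} \mathbf{F}\cdot d\mathbf{S} = \int_V 1\,dV = \tfrac{2}{3}\pi R^3,$$

where we have used the fact that a hemisphere of radius $R$ has half the 
volume of a complete sphere (i.e. $\tfrac{1}{2} \times \tfrac{4}{3}\pi R^3$). The answer agrees with that 
of Exercise 14. 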


Solution to Exercise 18 


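(a) In order for the flow to be steady-state, we must have $\nabla\cdot(\rho\mathbf{v}) = 0$. In 
this case, 

$$\nabla\cdot(\rho\mathbf{v}) = \frac{\partial(2y^2)}{\partial x} + \frac{\partial(-14yz)}{\partial y} + \frac{\partial(7z^2)}{\partial z} = 0 - 14z + 14z = 0,$$

so this can be a steady-state flow. 

(b) In this case, 

$$\nabla\cdot(\rho\mathbf{v}) = \frac{\partial(2x)}{\partial x} + \frac{\partial(-3y)}{\partial y} + \frac{\partial(4z)}{\partial z} = 2 - 3 + 4 = 3 \ne 0,$$

so this cannot be a steady-state flow. 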


Solution to Exercise 19 


The unit normals of a closed surface all point outwards into the exterior 
space. Using the right-hand grip rule, we see that the perimeters of B and 
C are traversed in a positive sense, and the perimeters of A and D are 
traversed in a negative sense. 


Solution to Exercise 20 


(a) We have 

$$\frac{d\mathbf{s}}{dt} = \frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j} = -\sin t\,\mathbf{i} + \cos t\,\mathbf{j}$$

and 

$$\mathbf{F} = \cos t\,\mathbf{j}.$$

Hence 

$$\mathbf{F}\cdot\frac{d\mathbf{s}}{dt} = \cos^2 t,$$

and the circulation of $\mathbf{F}$ around $C$ is 

$$\oint_C \mathbf{F}\cdot d\mathbf{s} = \int_0^{2\pi} \cos^2 t\,dt = \frac{1}{2}\int_0^{2\pi} (1 + \cos(2t))\,dt = \pi.$$

(b) The line integral of a conservative field around a closed path is equal 
to zero, so the circulation of $\mathbf{G}$ around $C$ is equal to zero. 

Solution to Exercise 21 
The left-hand side of equation (56) is $(\nabla\times\mathbf{F})\cdot\mathbf{k}$, which is equal to 

$$\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y} = \frac{\partial(x^2)}{\partial x} - \frac{\partial(0)}{\partial y} = 2x.$$

With the unit normal in the positive $z$-direction, the perimeter of ABCD 
must be traversed in an anticlockwise sense, as shown in the diagram. The 
field acts in the $y$-direction, so only sides BC and DA contribute to the 
circulation. Along side BC, the component of the field in the direction of 
the path has the constant value $(x + a)^2$. When this is integrated along 
the length of BC, it makes a contribution $(x + a)^2 a$ to the circulation. 
Along side DA, the component of the field in the direction of the path has 
the constant value $-x^2$. When this is integrated along the length of DA, it 
makes a contribution $-x^2 a$ to the circulation. So the total circulation of $\mathbf{F}$ 
around ABCD is 

$$\text{circulation} = (x + a)^2 a - x^2 a = (x^2 + 2ax + a^2)a - x^2 a = 2a^2 x + a^3.$$

The area of the square is $a^2$, so 

$$\text{circulation per unit area} = \frac{2a^2 x + a^3}{a^2} = 2x + a.$$


In the limit where a tends to zero, the left- and right-hand sides of 
equation (56) become equal, as required. (Note that in general, 
equation (56) applies only in the limit of a vanishingly small element.) 
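
The limiting behaviour described above can be seen numerically by shrinking the square. The sketch below assumes the field F = x^2 j used in this solution and a square with its lower-left corner at an arbitrarily chosen point (x0, y0); the function name and the sample values are illustrative.

```python
def circulation_around_square(F, x0, y0, a, n=1000):
    """Anticlockwise circulation of F around the square with corners
    (x0, y0), (x0 + a, y0), (x0 + a, y0 + a), (x0, y0 + a), by the midpoint rule."""
    ds = a / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        total += F(x0 + s, y0)[0] * ds       # bottom side, traversed in +x direction
        total += F(x0 + a, y0 + s)[1] * ds   # right side, traversed in +y direction
        total -= F(x0 + s, y0 + a)[0] * ds   # top side, traversed in -x direction
        total -= F(x0, y0 + s)[1] * ds       # left side, traversed in -y direction
    return total

def F(x, y):
    # Field from the solution: F = x^2 j
    return (0.0, x**2)

x0, y0 = 1.5, 0.3  # arbitrary corner position
for a in (1.0, 0.1, 0.01):
    per_area = circulation_around_square(F, x0, y0, a) / a**2
    # Compare with 2x + a, which tends to the curl component 2x as a -> 0
    print(a, per_area, 2 * x0 + a)
```

As $a$ decreases, the printed circulation per unit area approaches $2x$ at the corner, in line with the solution.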


Solution to Exercise 22 


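Using the curl theorem, the required line integral can be calculated from 
the surface integral of $\nabla\times\mathbf{F}$ over a rectangular area. However, we have 

$$\nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_x & F_y & 0 \end{vmatrix} = \mathbf{0},$$

so the surface integral, and hence the required line integral, is equal to 
zero. 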


Solution to Exercise 23 


Because F is a curl field, we can replace the curved surface of the bell by 
the flat circular disc at its mouth. To ensure that both surfaces have the 
same perimeter path, traversed in the same sense, the unit normal of the 
disc must be taken to be k (rather than —k). We therefore get 


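$$\int_{\text{bell}} \mathbf{G}\cdot d\mathbf{S} = \int_{\text{disc}} \mathbf{G}\cdot d\mathbf{S} = \int_{\text{disc}} \left(-2xz\,\mathbf{i} + (x^2 + z^2)\,\mathbf{k}\right)\cdot\mathbf{k}\,dS.$$

The disc lies in the $xy$-plane, so $z = 0$. Hence the integral reduces to 

$$\int_{\text{bell}} \mathbf{G}\cdot d\mathbf{S} = \int_{\text{disc}} x^2\,dS.$$

This integral was evaluated in Example 15, so the answer is $\pi R^4/4$. 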


[Diagram: square ABCD of side $a$ in the $xy$-plane, as referred to in Solution to Exercise 21] 


Although it is not part of the question, you could check that 
$\mathbf{G} = \nabla\times\mathbf{F}$, where $\mathbf{F} = -x^2 y\,\mathbf{i} + xz^2\,\mathbf{j}$. 